Abstract



The aim of this work is to put forward a statistical mechanics theory of social interaction,
generalizing econometric discrete choice models. After showing the formal equivalence linking
econometric multinomial logit models to equilibrium statistical mechanics, a multi-population
generalization of the Curie-Weiss model for ferromagnets is considered as a starting point in
developing a model capable of describing sudden shifts in aggregate human behaviour.

Existence of the thermodynamic limit for the model is shown by an asymptotic subadditivity
method and factorization of correlation functions is proved almost everywhere. The exact
solution of the model is provided in the thermodynamic limit by finding converging upper and
lower bounds for the system's pressure, and the solution is used to prove an analytic result
regarding the number of possible equilibrium states of a two-population system.

The work stresses the importance of linking regimes predicted by the model to real 
phenomena, and to this end it proposes two possible procedures to estimate the model's 
parameters starting from micro-level data. These are applied to three case studies based 
on census type data: though these studies are found to be ultimately inconclusive on an 
empirical level, considerations are drawn that encourage further refinements of the chosen 
modelling approach. 
Chapter 1



Introduction 



In recent years there has been an increasing awareness towards the problem of finding a 
quantitative way to study the role played by human interactions in shaping the kind of 
aggregate behaviour observed at a population level: reference [3] provides a comprehensive 
account of how ramified this field of study already is. There the author reviews efforts 
made by researchers from areas as diverse as psychology, economics and physics, to cite a 
few, in the pursuit of regularities that may characterize different kinds of aggregate human 
behaviour such as urban traffic, market behaviour and the internet. 

The idea of describing society as a unitary entity, characterized by global features not
dissimilar from those exhibited by physical or living systems, has accompanied the development
of philosophical thought since its very beginning, and one must look no further than Plato's
Republic to find an early example of such a view. The proposal that mathematics might play a
crucial role in pursuing such an idea, on the other hand, dates back at least to Thomas
Hobbes's Leviathan, where an attempt is made to draw analogies between the laws describing
mechanics and features of society as a whole. Hobbes's work gives an inspiring outlook on the
ways in which modern science might contribute to practical human affairs from an
organizational, as well as a technological, point of view.

In later centuries, nevertheless, quantitative science has grown aware of the fact that,
though a holistic view such as Hobbes's plays an important motivational role in the
development of new scientific enterprises, it is only by reducing a problem to its simplest
components that success is attained by empirical studies. One of the interesting sub-problems
singled out by the modern approach is that of characterizing the behaviour of large groups of
people, when each individual is faced with a choice among a finite set of alternatives, and
a set of motives driving the choice can be identified. Such motives might be given by the
person's own preferences, as well as by the way he or she interacts with other people. My
thesis aims to contribute to the research effort which is currently analysing the role played
by social interaction in the human decision-making process just described.

As early as the nineteen-seventies the dramatic consequences of including interaction
between peers in a mathematical model of choice comprising large groups of people were
recognized independently by the physics [23], economics [62] and social science [34]
communities. The conclusion reached by all these studies is that mathematical models have
the potential to describe several features of social behaviour, among which the sudden and
dramatic shifts often observed in societal trends [U], and that these are unavoidably linked
to the way individual people influence each other when deciding how to behave.

The possibility of using such models as a tool of empirical investigation, however, is
not found in the scientific literature until the beginning of the present decade [21]: the
reason is to be found in the intrinsic difficulty of establishing a methodology of systematic
measurement for social features. Confidence that such an aim might be achievable has been
boosted by the wide consensus gained by econometrics following the Nobel Prize awarded in
2000 to economist Daniel McFadden for his work on probabilistic models of discrete choice,
and by the increasing interest of policy makers in tools enabling them to cope with the
global dimension of today's society [39] [27].

This has led very recently to a number of studies confronting directly the challenge of 
quantitatively measuring social interaction for bottom-up models, that is, models deriving 
macroscopic phenomena from assumptions about human behaviour at an individual level 

[mielKSIKM]. 

These works show an interesting interplay of methods coming from econometrics [25],
statistical physics [26] and game theory [13], which reveals a substantial overlap in the basic
assumptions driving these three disciplines. It must also be noted that all of these studies
rely on a simplifying assumption which considers interaction working on a global uniform
scale, that is, on a mean field approach. This is due to the inability, stated in [69], of
existing methods to measure social network topological structure in any detail. It is expected
that it is only a matter of time before technology allows us to overcome this difficulty: in
the meantime, one of the roles of today's empirical studies is to assess how much information
can be derived from the existing kinds of data, such as those coming from surveys, polls and
censuses.

This thesis considers a mean field model that highlights the possibility of using the
methods of discrete choice econometrics to apply a statistical mechanical generalization of
the model introduced in [21]. The approach is mainly that of mathematical physics: this
means that the main aim shall be to establish, in a rigorous way, the mathematical properties
of the proposed model, such as the existence of the thermodynamic limit, its factorization
properties, and its solution. It is hoped that this might be used as a good building
block for later, more refined theories. Furthermore, since perhaps the most problematic
point of a mathematical study of society lies in the feasibility of measuring the relevant
quantities starting from real data, two estimation procedures are put forward: one tries to
mimic the econometrics approach, while the other stems directly from equilibrium statistical
mechanics, by stressing the role played by fluctuations of the main observable quantities.
These procedures are applied to some simple case studies.

The thesis is therefore organised as follows: the first chapter reviews the theory of
Multinomial Logit discrete choice models. These models are based on a probabilistic approach
to the psychology of choice [H], which is chosen here as the modelling approach to human
decision making. In this chapter we focus on the mathematical form of Multinomial Logit,
and in particular on its equivalence to the statistical mechanics of non-interacting particles.
In the second chapter we consider the Curie-Weiss model, of which we provide a treatment
recently developed in the wider study of mean field spin glasses [37], which allows us to give
elegant rigorous proofs of the model's properties. In chapter three we generalise results from
chapter two to a system partitioned into an arbitrary number of components. Since such
a model corresponds to the generalization of discrete choice first considered in [21], which
includes the effect of peer pressure in the decision-making process, it provides a potential
tool for the study of social interaction: chapter four shows an application of this to three
simple case studies.
Chapter 2



Discrete choice models 



In this chapter we describe the general theory of discrete choice models. These are
econometric models that were first applied to the study of demand in transportation systems in
the nineteen-seventies [6]. When people travel they can choose the mode of transportation
among a set of distinct alternatives, such as train or automobile, and the basic tenet of
these models is that such a discrete choice can be described by a probability distribution,
and that proposals for the form of such a distribution can be derived from principles
established at the level of individuals. As we shall see, this modus operandi is one familiar
to statistical mechanics, and corresponds to what is commonly known as a bottom-up strategy
in finance.

After describing the general scope of discrete choice analysis, in section 2.3 we describe
precisely the mathematical structure of one of the most widely used discrete choice models,
the Multinomial Logit model. Here we shall see how the probability distribution describing
people's choices arises from the assumption that individuals act trying to maximize the
benefit coming from that choice, which is the common setting of neoclassical economics.
Discrete choice models, in general, ignore the effect of social interaction, but we shall see in
subsection 2.3.3 that the Multinomial Logit can be rephrased precisely as a statistical
mechanical model, which gives an ideal starting point for extending such a model of behaviour
to a context including interaction, to be considered in later chapters.

For his development of the theory of the Multinomial Logit model, economist Daniel
McFadden was awarded the Nobel Prize in Economics in 2000 [50], for bringing economics
closer to quantitative scientific measurement. The purpose of discrete choice theory is to
describe people's behaviour: it is an econometric technique to infer people's preferences from
empirical data. In discrete choice theory the decision-maker is assumed to make choices
that maximise his or her own benefit. This 'benefit' is described by a mathematical formula,
a utility function, which is derived from data collected in surveys. This utility function
includes rational preferences, but also accounts for elements that deviate from rational
behaviour.

Table 1. Prediction Success Table, Journey-to-Work
(Pre-BART Model and Post-BART Choices)

Cell counts (rows: actual choices; columns: predicted choices)

Actual choices      Auto Alone    Carpool       Bus      BART     Total
Auto Alone               255.1       79.1      28.5      15.2       378
Carpool                   74.7       37.7      15.7       8.9       137
Bus                       12.8       16.5      42.9       4.7        77
BART                       9.8       11.1       6.9      11.2        39
Total                    352.4      144.5      94.0      40.0       631

Predicted share          55.8%      22.9%     14.9%      6.3%
(Std. error)           (11.4%)    (10.7%)    (3.7%)   (illegible)
Actual share             59.9%      21.7%     12.2%      6.2%

Figure annotation: 98% agreement.

Figure 2.1: Discrete choice predictions against actual use of travel modes in San Francisco, 1975
(source: McFadden 2001)

Though discrete choice models do not account for 'peer pressure' or 'herding effects', it
is nonetheless a fact that the standard performance of discrete choice models is close to
optimal for the analysis of many phenomena where peer influence is perhaps not a major
factor in an individual's decision: Figure 2.1 shows an example of this. The table (taken
from [50]) compares predictions and actual data concerning the use of travel modes, before and
after the introduction of a new rail transport system called BART in San Francisco, 1975.
We see a remarkable agreement between the predicted share of people using BART (6.3%)
and the actual measured figure after the introduction of the service (6.2%).
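As a small arithmetic check, the shares quoted above can be recomputed from the cell counts of the prediction success table. The sketch below is only illustrative: where digits are garbled in the available copy of the table, the values used are those consistent with the printed row and column totals, which is an assumption rather than a reading of the original source.

```python
import numpy as np

modes = ["Auto Alone", "Carpool", "Bus", "BART"]
# Rows: actual choices; columns: predicted choices.
# Assumption: garbled digits were resolved using the printed row/column totals.
counts = np.array([
    [255.1, 79.1, 28.5, 15.2],
    [ 74.7, 37.7, 15.7,  8.9],
    [ 12.8, 16.5, 42.9,  4.7],
    [  9.8, 11.1,  6.9, 11.2],
])

total = counts.sum()
predicted_share = counts.sum(axis=0) / total   # column totals over grand total
actual_share = counts.sum(axis=1) / total      # row totals over grand total

for mode, p, a in zip(modes, predicted_share, actual_share):
    print(f"{mode:<10}  predicted {100 * p:5.1f}%   actual {100 * a:5.1f}%")
```

The predicted share of a mode is its column total divided by the grand total, while the actual share is the corresponding row total divided by the grand total.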



2.1 General theory 

In discrete choice each decision process is described mathematically by a utility function,
which each individual seeks to maximize. The principle of utility maximization is one which
lies at the heart of neoclassical economics: this has often been criticised as too simplistic an
assumption for complex human behaviour, and this criticism has been supported by the
poor performance of quantitative models arising from such an assumption. It must be
noted, however, that if we wish to attain a quantitative description of human behaviour at
all, we must do so by considering a description which is analytically treatable. There exist, of
course, alternative approaches (e.g. agent-based modelling), but since this field of research
is still in its youth, it pays to consider possible improvements of utility maximisation before
abandoning it altogether. This is indeed the view taken by discrete choice, which sees people
as rational utility maximizers, but also takes into account a certain degree of irrationality,
which is modelled through a random contribution to the utility function.

As an example, a binary choice could be to either cycle to work or to catch a bus. The 
utility function for choosing the bus may be written as: 

\[ U = V + \varepsilon \tag{2.1} \]

where $V$, the deterministic part of the utility, could be symbolically parametrised as follows

\[ V = \sum_a \lambda_a x_a + \sum_a \alpha_a y_a \tag{2.2} \]

The variables $x_a$ are assumed to be attributes regarding the choice alternatives
themselves, for example the bus fare or the journey time. On the other hand, the $y_a$ may be
socio-economic variables that define the decision-maker, for example their age, gender or
income. It is this latter set of variables that allows us to zoom in on specific geographical
areas or socio-economic groups. The $\lambda_a$ and $\alpha_a$ are parameters that need to be
estimated empirically, through survey data, for instance. The key property of these parameters
is that they quantify the relative importance of any given attribute in a person's decision: the
larger a parameter's value, the more the attribute will affect a person's choice. For example,
we may find that certain people are more affected by the journey time than by the bus fare;
therefore changing the fare may not influence their behaviour significantly. The next section
will explain how the value of these parameters is estimated from empirical data.

It is an observed fact [49] [2] that choices are not always perfectly rational. For example,
someone who usually goes to work by bus may one day decide to cycle instead. This may be
because it was a nice sunny day, or for no evident reason. This unpredictable component of
people's choices is accounted for by the random term $\varepsilon$. The distribution of
$\varepsilon$ may be assumed to be of different forms, giving rise to different possible models:
if, for instance, $\varepsilon$ is assumed to be normal, the resulting model is called a probit
model, and it does not admit a closed form solution. Discrete choice analysis assumes
$\varepsilon$ to be extreme-value distributed, and the resulting model is called a logit model
[6]. This is very convenient in practice: it does not impose any significant restrictions on
the model, but simplifies it considerably. In particular, it allows us to obtain a closed form
solution for the probability of choosing a particular alternative, say catching a bus rather
than cycling to work:



\[ P = \frac{e^{V}}{1 + e^{V}} \tag{2.3} \]

(see section 2.3 for the derivation).
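The binary logit probability can be evaluated directly once the deterministic utility $V$ of (2.2) is known. The following is a minimal sketch, assuming that the utility of the alternative choice (cycling) is normalised to zero; the coefficient and attribute values are hypothetical and chosen only for illustration.

```python
import math

def binary_logit_probability(v: float) -> float:
    """Probability of choosing the alternative with deterministic utility v
    over a reference alternative whose utility is normalised to zero."""
    # Numerically stable evaluation of P = e^v / (1 + e^v).
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    z = math.exp(v)
    return z / (1.0 + z)

# Hypothetical coefficients and attributes, for illustration only.
lambda_fare, lambda_time = -0.8, -0.05   # weights of bus fare and journey time
alpha_income = -0.01                     # weight of a socio-economic variable
fare, minutes, income = 2.0, 30.0, 25.0

# Deterministic utility: a linear combination of attributes, as in (2.2).
v_bus = lambda_fare * fare + lambda_time * minutes + alpha_income * income
p_bus = binary_logit_probability(v_bus)
print(f"V = {v_bus:.2f}, P(bus) = {p_bus:.3f}")
```

With negative weights on fare and journey time, raising either attribute lowers $V$ and hence the probability of taking the bus.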

In words, this describes the rational preferences of the decision maker. As will be
explained later on, (2.3) is analogous to the equation describing the equilibrium state of
a perfect gas of heterogeneous magnetic particles (a Langevin paramagnet): just as gas
particles react to external forces differently depending, for instance, on their mass and
charge, discrete choice describes individuals as experiencing heterogeneous influences in
their decision-making, according to their own socio-economic attributes, such as gender and
wealth. A question arises spontaneously: do people and gases behave in the same way? The
answer to such a controversial question is that in some circumstances they might. Models
are idealisations of reality, and equation (2.3) is telling us that the same equation may
describe idealised aspects of both human and gas behaviour; in particular, how individual
behaviour relates to macroscopic or societal variables. These issues go beyond the scope of
this thesis, but it is important to note that (2.3) offers a mathematical and intuitive link
between econometrics and statistical mechanics. The importance of this 'lucky coincidence'
cannot be overstated, and some of the implications will be discussed later on in more detail.



2.2 Empirical estimation 

Discrete choice may be seen as a purely empirical model. In order to specify the actual
functional form associated with a specific group of people facing a specific choice, empirical
data is needed. The actual utility function is then specified by estimating the numerical
values of the parameters $\lambda_a$ and $\alpha_a$ which appear in our definition of $V$ given
by (2.2), thus establishing the choice probabilities (2.3). As mentioned earlier, these
parameters quantify the relative importance of the attribute variables $x_a$ and $y_a$. For
example, costs are always associated with negative parameters: this means that the higher the
price of an alternative, the less likely people will be to choose it. This makes intuitive
sense: what discrete choice offers is a quantification of this effect. Once the data has been
collected, the model parameters may be estimated by standard statistical techniques: in
practice, Maximum Likelihood estimation methods are used most often (see, e.g., [6] chapter 4).
We shall see in further chapters how, though optimal for standard discrete choice models,
Maximum Likelihood estimation seems to be unsuitable for phenomena involving interaction, due
to discontinuities in the probability structure. As we shall see, a valuable alternative is
given by a method put forward by Joseph Berkson [7].
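To make the Maximum Likelihood step concrete, the sketch below fits a binary logit model to synthetic data by Newton-Raphson iteration on the log-likelihood. It is an illustration under stated assumptions, not the procedure used in the case studies: the data are simulated, and the names `fit_binary_logit`, `true_beta` and the chosen coefficients are hypothetical.

```python
import numpy as np

def fit_binary_logit(X, y, n_iter=25):
    """Maximum likelihood estimate of beta in P(y=1|x) = 1/(1+exp(-x.beta)),
    computed by Newton-Raphson on the concave log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (y - p)                          # score vector
        # Hessian of the log-likelihood (negative definite for full-rank X).
        hessian = -(X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta - np.linalg.solve(hessian, gradient)
    return beta

rng = np.random.default_rng(0)
n = 5000
# Synthetic "survey": an intercept, a cost attribute and an income variable.
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
true_beta = np.array([0.5, -1.0, 0.3])   # hypothetical; note the negative cost weight
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta_hat = fit_binary_logit(X, y)
print("true:     ", true_beta)
print("estimated:", np.round(beta_hat, 2))
```

Because the logit log-likelihood is concave in the parameters, the iteration converges to the unique maximum whenever one exists; this is the regularity that is lost once interaction introduces discontinuities.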

Discrete choice has been used to study people's preferences since the seventies [50].
Initial applications focused on transport [68] [53]. These models have been used to develop
national and regional transport models around the world, including the UK, the Netherlands
[24], as well as Copenhagen [S^. Since then discrete choice has also been applied to a range
of social problems, for example healthcare [30] [59], telecommunications [42] and social care
m 

2.3 The Multinomial logit model 

The binomial logit model which gives the probabilities (2.3) can be seen as a special case of
the Multinomial Logit model, which was introduced by R. Duncan Luce in 1959 [18] when
developing a mathematical theory of choice in psychology, and was later given the utility
maximization form which we describe here by Daniel McFadden [50].

In the following three subsections we shall describe the mathematical structure of a
Multinomial Logit model. In the first subsection we give information about the
Gumbel extreme-value distribution, which is the distribution by which the model describes the
random contribution $\varepsilon$ to a person's utility, and is chosen essentially for reasons
of analytical convenience. The second subsection uses the properties of the Gumbel
distribution in order to derive the probability structure of the model. These two subsections
summarise the main results, which can be found in any standard book on econometrics [6] [25].

The third subsection gives the statistical mechanical reformulation of the Multinomial 
Logit model, by showing that the same probability structure arises when we compute the 
pressure of a suitably chosen Hamiltonian: this leads the way for the extensions of the 
model that shall be considered in later chapters. 

2.3.1 Properties of the Gumbel distribution 

In order to implement the modelling assumption of utility maximization in a quantitative
way, we need a suitable probability distribution for the random term $\varepsilon$.

The Multinomial Logit model describes randomness in choice by a Gumbel distribution,
which has cumulative distribution function

\[ F(x) = \exp\{-e^{-\mu(x-\eta)}\}, \qquad \mu > 0, \]

and probability density function

\[ f(x) = \mu\, e^{-\mu(x-\eta)} \exp\{-e^{-\mu(x-\eta)}\}. \]

We have that if $\varepsilon \sim \mathrm{Gumbel}(\eta, \mu)$ then

\[ E(\varepsilon) = \eta + \frac{\gamma}{\mu}, \qquad \mathrm{Var}(\varepsilon) = \frac{\pi^2}{6\mu^2}, \]

where $\gamma$ is the Euler-Mascheroni constant ($\approx 0.577$).

The Gumbel distribution is a type of extreme-value distribution, which means that
under suitable conditions it gives the limit distribution for the value of the extremum of a
sequence of i.i.d. random variables, just as the Gaussian distribution does for their average
under the central limit theorem. In econometrics the Gumbel distribution is used mainly for
analytical reasons, since it has a number of interesting properties which make it suitable
as a modelling tool. As we shall see in subsection 2.3.3, the model that one obtains can be
readily mapped into a statistical mechanical model, thus establishing an interesting link
between economics and physics.

The following two properties regard Gumbel variables with equal variance, and hence
equal $\mu$ (see [6], p. 104).

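I. If $\varepsilon' \sim \mathrm{Gumbel}(\eta_1, \mu)$ and $\varepsilon'' \sim \mathrm{Gumbel}(\eta_2, \mu)$
are independent random variables, then $\varepsilon = \varepsilon' - \varepsilon''$ is
logistically distributed with cumulative distribution

\[ F_\varepsilon(x) = \frac{1}{1 + e^{\mu(\eta_2 - \eta_1 - x)}}, \]

and probability density

\[ f_\varepsilon(x) = \frac{\mu\, e^{\mu(\eta_2 - \eta_1 - x)}}{\left(1 + e^{\mu(\eta_2 - \eta_1 - x)}\right)^2}. \]

II. If $\varepsilon_i \sim \mathrm{Gumbel}(\eta_i, \mu)$ for $1 \le i \le k$ are independent then

\[ \max_{i=1..k} \varepsilon_i \sim \mathrm{Gumbel}\Big(\frac{1}{\mu} \ln \sum_{i=1}^{k} e^{\mu \eta_i},\ \mu\Big). \]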



As we said, the logit is a model which is founded on the assumption that individuals choose
their behaviour trying to maximize a utility, or "benefit", function. In the next section
we shall use Property II to handle the probabilistic maximum of the utilities coming from
many different choices, whereas Property I shall be used to compare probabilistically the
benefits of two different choices.



2.3.2 Econometrics 

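We shall now derive the probability distribution for an individual $l$ choosing between $k$
alternatives $i = 1..k$. We have that choice $i$ yields $l$ a utility:

\[ U_i^{(l)} = V_i^{(l)} + \varepsilon_i^{(l)}. \]

We assume that $l$ chooses the alternative with the highest utility. However, since these
are random we can only compute the probability that a particular choice is made:

\[ p_{l,i} = P(\text{``$l$ chooses $i$''}). \]

This is in fact the probability that $U_i^{(l)}$ is bigger than all other utilities, and we can
write this as follows:

\[ p_{l,i} = P\big(U_i^{(l)} \ge \max_{j \ne i} U_j^{(l)}\big)
          = P\big(V_i^{(l)} + \varepsilon_i^{(l)} \ge \max_{j \ne i} (V_j^{(l)} + \varepsilon_j^{(l)})\big). \]

Now define

\[ U^* = \max_{j \ne i} \big(V_j^{(l)} + \varepsilon_j^{(l)}\big). \]

By property II of the Gumbel distribution,

\[ U^* \sim \mathrm{Gumbel}\Big(\frac{1}{\mu} \ln \sum_{j \ne i} e^{\mu V_j^{(l)}},\ \mu\Big). \]

So, if

\[ V^* = \frac{1}{\mu} \ln \sum_{j \ne i} e^{\mu V_j^{(l)}}, \]

we have that $U^* = V^* + \varepsilon^*$ with $\varepsilon^* \sim \mathrm{Gumbel}(0, \mu)$.
This in turn gives us that

\[ p_{l,i} = P\big(V_i^{(l)} + \varepsilon_i^{(l)} \ge V^* + \varepsilon^*\big)
          = \frac{1}{1 + e^{\mu(V^* - V_i^{(l)})}} \]

by property I of the Gumbel distribution, and this can be re-expressed as

\[ p_{l,i} = \frac{e^{\mu V_i^{(l)}}}{\sum_{j=1}^{k} e^{\mu V_j^{(l)}}}. \]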
According to econometric knowledge $\mu$ is a parameter which cannot be identified from
statistical data. From a physical perspective, this corresponds to the lack of a well defined
temperature: intuitively this makes sense, since measuring temperature consists in comparing
a system of interest with another system whose state we assume to know perfectly well.
In physics this can be done to a high degree of precision: in social systems, however, such
a concept has as yet no clear meaning, and finding one will most certainly require a change in
perspective about what we mean by measuring a quantity.

As a practical consequence, in this simple model we can let the parameter $\mu$ be
incorporated into the degrees of freedom $V_i^{(l)}$ of the various utilities, and get the
choice probabilities in the following form:

\[ p_{l,i} = \frac{e^{V_i^{(l)}}}{\sum_{j=1}^{k} e^{V_j^{(l)}}} \tag{2.4} \]
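The derivation above can be checked by simulation: drawing Gumbel noise, adding it to the deterministic utilities and recording which alternative has the largest total utility should reproduce the Multinomial Logit probabilities. The sketch below is illustrative; the utilities `v` and the value of `mu` are hypothetical, and `choice_probabilities` is a helper name introduced here.

```python
import numpy as np

def choice_probabilities(v, mu=1.0):
    """Multinomial Logit probabilities exp(mu v_i) / sum_j exp(mu v_j)."""
    z = mu * np.asarray(v, dtype=float)
    z -= z.max()                 # subtract the maximum for numerical stability
    w = np.exp(z)
    return w / w.sum()

rng = np.random.default_rng(2)
v = np.array([1.0, 0.2, -0.5, 0.7])   # hypothetical deterministic utilities
mu = 1.5
n = 200_000

# Draw random utilities U_i = V_i + eps_i and record the maximising alternative.
eps = rng.gumbel(loc=0.0, scale=1.0 / mu, size=(n, v.size))
choices = np.argmax(v + eps, axis=1)
simulated = np.bincount(choices, minlength=v.size) / n

print("simulated:", np.round(simulated, 3))
print("formula:  ", np.round(choice_probabilities(v, mu), 3))

# Rescaling (mu, V) -> (1, mu V) leaves the probabilities unchanged,
# which is why mu cannot be identified separately from V.
rescaling_invariant = np.allclose(choice_probabilities(v, mu),
                                  choice_probabilities(mu * v))
print(rescaling_invariant)
```

The last check illustrates why $\mu$ can be absorbed into the utilities: only the products $\mu V_i^{(l)}$ enter the probabilities.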



2.3.3 Statistical mechanics 

As we have seen, the Multinomial Logit model follows a utility-maximization approach,
in that it assumes that each person behaves so as to optimize his or her own benefit. From
a statistical-mechanical perspective, this amounts to the community of people trying to
identify its ground state, where some definition of self-perceived well-being, the utility,
takes the role traditionally played by energy.

If there were an exact value of the utility corresponding to each behaviour, a system
characterized by such a maximizing principle for the ground state would identify a
microcanonical ensemble in equilibrium statistical mechanics. This amounts to stating that the
energy of the system has an exact value, as opposed to being a random variable.

However, since the Multinomial logit defines utility itself as a Gumbel random variable 
in order to try and capture both the predictable and unpredictable components of human 
decisions, its "ground state" turns out to be a "noisy" object. Statistical mechanics models 
this situation by defining a so-called canonical ensemble, where all possible values of the 
energy are considered, each with a probability given by a Gibbs distribution, which weights 
energetically favourable states more than unfavourable ones. We will now see how the Gibbs 
distribution leads to a model which is formally equivalent to the Multinomial logit arising 
from the Gumbel distribution. 

Assume that we have a population of $N$ people, each of whom makes a choice

\[ \sigma^{(l)} = e_i, \]

where the vectors $e_i$ form the $k$-dimensional canonical basis

\[ e_1 = (1,0,\dots,0), \quad e_2 = (0,1,\dots,0), \quad \text{etc.} \]

We have then that a particular state of this system can be described by the following
set:

\[ \sigma = \{\sigma^{(1)}, \dots, \sigma^{(N)}\}. \]

Now define $v^{(l)}$ as a $k$-dimensional vector giving the utilities of the various choices for
individual $l$:

\[ v^{(l)} = (V_1^{(l)}, \dots, V_k^{(l)}). \]

We have that $V_i^{(l)}$, which is the deterministic part of the utility considered in the last
section, changes from person to person, and that it can be parametrised by a person's
social attributes, for instance. For the moment, however, we just consider the $V_i^{(l)}$ as
different numbers, since the exact parametrization doesn't change the nature of the probability
structure.

If we now denote by $\cdot$ the scalar product between two vectors, we may express
the energy (also called Hamiltonian) for the Multinomial Logit model as follows:

\[ H_N(\sigma) = -\sum_{l=1}^{N} v^{(l)} \cdot \sigma^{(l)}. \]

Intuitively, a Hamiltonian defines a model where the favoured states $\sigma$ are the ones
which make the quantity $H_N$ small, which, due to the minus sign, correspond to people
choosing so as to maximise their utility. Most of the information contained in an equilibrium
statistical mechanical model can be derived from its pressure, which is defined as

\[ P_N = \ln \sum_{\sigma} e^{-H_N(\sigma)}, \]

which acts as a moment generating function for the Gibbs distribution

\[ p(\sigma) = \frac{e^{-H_N(\sigma)}}{\sum_{\sigma'} e^{-H_N(\sigma')}}, \]

and can recover many of the features of the model, among which the probabilities $p_{l,i}$, as
derivatives of $P_N$ with respect to suitable parameters.

This distribution is chosen in physics since it is the one which maximises the system's
entropy at a given temperature, which in turn just means that it is the most likely
distribution to expect for a system which is at equilibrium. This is not to say that using
such a model corresponds to accepting that society is at equilibrium, but rather to believing
that some features of society might have small enough variations for a period of time long
enough to allow a quantitative study. As pointed out in a later chapter, this belief has at
least some quantitative backing if one considers the remarkable findings made by Emile
Durkheim as early as the end of the 19th century [20].

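We will now show that this model is equivalent to the Multinomial Logit by computing
its pressure explicitly and finding its derivatives. Indeed, since the model doesn't include
interaction this is a task that can be done easily for a finite $N$:

\[
\begin{aligned}
P_N &= \ln \sum_{\sigma} e^{-H_N(\sigma)}
     = \ln \sum_{\sigma} \exp\Big\{\sum_{l=1}^{N} v^{(l)} \cdot \sigma^{(l)}\Big\} \\
    &= \ln \Big( \sum_{\sigma^{(1)}} \exp\{v^{(1)} \cdot \sigma^{(1)}\} \cdots
                 \sum_{\sigma^{(N)}} \exp\{v^{(N)} \cdot \sigma^{(N)}\} \Big) \\
    &= \ln \prod_{l=1}^{N} \sum_{i=1}^{k} \exp\{V_i^{(l)}\}
     = \sum_{l=1}^{N} \ln \sum_{i=1}^{k} \exp\{V_i^{(l)}\}.
\end{aligned}
\]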

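Once we have the pressure $P_N$ it is easy to find the probability $p_{l,i}$ that person $l$
chooses alternative $i$, just by computing the derivative of $P_N$ with respect to the utility
$V_i^{(l)}$:

\[ p_{l,i} = P(\text{``$l$ chooses $i$''}) = \frac{\partial P_N}{\partial V_i^{(l)}}
          = \frac{e^{V_i^{(l)}}}{\sum_{j=1}^{k} e^{V_j^{(l)}}}, \]

which is the same as (2.4).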
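The two steps of this argument, the factorisation of the pressure and the identification of its derivative with the choice probability, can be verified numerically on a small system. The sketch below is only an illustration: the utilities are random numbers, the system size is kept tiny so that all configurations can be enumerated, and the function names are introduced here for convenience.

```python
import itertools
import numpy as np

def pressure(v):
    """P_N = sum_l ln sum_i exp(v[l, i]) for an (N, k) array of utilities."""
    m = v.max(axis=1, keepdims=True)   # stabilise the log-sum-exp row by row
    return float((m[:, 0] + np.log(np.exp(v - m).sum(axis=1))).sum())

def pressure_brute_force(v):
    """ln sum_sigma exp(-H_N(sigma)), enumerating all k^N configurations."""
    n, k = v.shape
    total = 0.0
    for sigma in itertools.product(range(k), repeat=n):
        # -H_N(sigma) is the sum of the utilities of the chosen alternatives.
        total += np.exp(sum(v[l, i] for l, i in enumerate(sigma)))
    return float(np.log(total))

rng = np.random.default_rng(3)
v = rng.normal(size=(4, 3))      # hypothetical utilities: N = 4 people, k = 3 choices

# Finite-difference derivative of the pressure with respect to V_i^(l).
l, i, h = 2, 1, 1e-6
bumped = v.copy()
bumped[l, i] += h
derivative = (pressure(bumped) - pressure(v)) / h

softmax = np.exp(v[l]) / np.exp(v[l]).sum()
print(pressure(v), pressure_brute_force(v))
print(derivative, softmax[i])
```

The first printed pair compares the factorised pressure with the brute-force sum over configurations; the second compares the numerical derivative with the Multinomial Logit probability for the same person and alternative.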

This shows how the utility maximization principle is equivalent to a Hamiltonian model,
whenever the random part of the utility is Gumbel distributed. There is a simple
interpretation for this statistical mechanical model: it is a gas of $N$ magnetic particles,
each of which has $k$ states, and the energy of these states depends on the corresponding value
of the utility $V_i^{(l)}$, which therefore bears a close analogy to a magnetic field acting on
the particle.

This model may seem completely uninteresting, since it is in no essential way different 
from a Langevin paramagnet. What is interesting, however, is how such a familiar, if trivial, 
model has arisen independently in the field of economics, and there are a few simple points 
to be made that can emphasize the change in perspective. 

First, we see how for this model it makes sense to consider the pressure $P_N$ as an
extensive quantity. This is due to the fact that these models are applied to samples of data
that yield information about each single individual, rather than being applied to extremely
large ensembles of particles that we regard as identical, and of which we measure average
quantities. Second, the availability of data about individuals (microeconomic data) allows
us to define the vector $v^{(l)}$, which assigns a benefit value to each of the alternatives
available to individual $l$.

The main goal of an econometric model of this kind is then to find the parametrization for
$v^{(l)}$ in terms of observable socio-economic features which fits micro data in an optimal
way. The main goal of statistical mechanics is, on the other hand, to find a microscopic theory
capable of generating laws that are observed consistently over a large number of experiments
and measured with extreme precision at a macroscopic level. Since the numbers available
for microeconomic data are not as high as the number of particles in a physical system,
but these data are more detailed at the level of individuals, the goal of a model of social
behaviour could be seen as an interesting mixture of the above.

2.4 The role of statistical mechanics 

We have seen how discrete choice can be given a statistical mechanical description: in this
section we consider why this is of interest for modelling social phenomena.

A key limitation of discrete choice theory is that it does not formally account for social
interactions and imitation. In discrete choice each individual's decisions are based on purely
personal preferences, and are not affected by other people's choices. However, there is a
great deal of theoretical and empirical evidence to suggest that an individual's behaviour,
attitude, identity and social decisions are influenced by those of others through vicarious
experience or social influence, persuasion and sanctioning [Hll]. These theories specifically
relate to the interpersonal social environment, including social networks, social support, role
models and mentoring. The key insight of these theories is that individuals' behaviours and
decisions are affected by their relationships with those around them, e.g. their parents or
their peers.

Mathematical models that take into account social influence have been considered by 
social psychology since the '70s (see [63] for a short review). In particular, influential works 
by Schelling [62] and Granovetter [31] have shown how models where individuals take into
account the mean behaviour of others are capable of reproducing, at least qualitatively,
the dramatic opinion shifts observed in real life (for example in financial bubbles or during
street riots). In other words, they observed that the interaction built into their models was 
unavoidably linked to the appearance of structural changes on a phenomenological level in 
the models themselves. 

Figure 2.2: The diagram illustrates how the inclusion of social interactions (right) leads to the
existence of tipping points. By contrast, models that do not account for social interactions cannot
account for tipping points.

Figure 2.2 compares the typical dependence of average choice with respect to an attribute
parameter, such as cost, in discrete choice analysis (left), where the dependence is
always a continuous one, with the typical behaviour of an interaction model of Schelling or
Granovetter kind (right), where small changes in the attributes can lead to a drastic jump
in the average choice, reflecting structural changes such as the disappearing of equilibria in
the social context.

The research course initiated by Schelling was eventually linked to the parallel development
of the discrete choice analysis framework at the end of the '90s, when Brock and
Durlauf [21] suggested a direct econometric implementation of the models considered by
social psychology. In order to accomplish this, Brock and Durlauf had to delve into the
implications of a model where an individual takes into account the behaviour of others when
making a discrete choice: this could only be done by considering a new utility function
which depended on the choices of all other people.

This new utility function was built by starting from the assumptions of discrete choice
analysis. The utility function reflects what an individual considers desirable: if we hold
(see, e.g., [10]) that people consider it desirable to conform to the people they interact with, we
have that, as a consequence, an individual's utility increases when he agrees with other
people.

Symbolically, we can say that when an individual $i$ makes a choice, his utility for that
choice increases by an amount $J_{ij}$ when another individual $j$ agrees with him, thus defining
a set of interaction parameters $J_{ij}$ for all couples of individuals. The new utility function
for individual $i$ hence takes the following form:

$$U_i = \sum_j J_{ij}\,\tau_j + \sum_a \lambda_a x_a^{(i)} + \sum_a \alpha_a y_a^{(i)} + \varepsilon, \tag{2.5}$$

where the sum $\sum_j$ ranges over all individuals, and the symbol $\tau_j$ is equal to 1 if $j$ agrees
with $i$, and 0 otherwise.

Analysing the general case of such a model is a daunting task, since the choice of another
individual $j$ is itself a random variable, which in turn correlates the choices of all individuals.
This problem, however, has been considered by statistical mechanics since the end of the
19th century, throughout the twentieth century, until the present day. Indeed, the first
success of statistical mechanics was to give a microscopic explanation of the laws governing
perfect gases, and this was achieved thanks to a formalism which is strictly equivalent to
the one obtained by discrete choice analysis in (2.4).

The interest of statistical mechanics eventually shifted to problems concerning interaction
between particles, and as daunting as the problem described by (2.5) may be, statistical
physics has been able to identify some restrictions on models of this kind to make them
tractable while retaining great descriptive power as shown, e.g., in the work of Pierre Weiss
[70] regarding the behaviour of magnets.

The simplest way devised by physics to deal with such a problem is called a mean
field assumption, where interactions are assumed to be of a uniform and global kind. This
leads to a manageable closed-form solution and a model that is consistent with the models of
Schelling and Granovetter. Moreover, this assumption is also shown by Brock and Durlauf
to be closely linked to the assumption of rational expectations from economic theory, which 
assumes that the observed behaviour of an individual must be consistent with his belief 
about the opinion of others. 

By assuming mean field or rational expectations we can rewrite (2.5) in the tamer form

$$U_i = J m + \sum_a \lambda_a x_a^{(i)} + \sum_a \alpha_a y_a^{(i)} + \varepsilon, \tag{2.6}$$

where $m$ is the average opinion of a given individual, and this average value is coupled to
the model parameters by a closed form formula.

If we now define $V_i$ to be the deterministic part of the utility, similarly as before,

$$V_i = J m + \sum_a \lambda_a x_a^{(i)} + \sum_a \alpha_a y_a^{(i)},$$

we have that the functional form of the choice probability, given by (2.4),

$$p_i = \frac{e^{V_i}}{1 + e^{V_i}}, \tag{2.7}$$

remains unchanged, allowing the empirical framework of discrete choice analysis to be used 
to test the theory against real data. This sets the problem as one of heterogeneous inter- 
acting particles, and we shall see in the next two chapters how such a mean-field model, 
just like the standard Multinomial Logit, can be given a Hamiltonian statistical mechanical 
form, and solved in a completely rigorous way using elementary mathematics, via methods 
recently developed in the context of spin glasses [37] . 
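
The qualitative difference between the two panels of Figure 2.2 can be illustrated with a few lines of Python. The sketch below is only an illustration under simplifying assumptions that are not part of the text: the population is homogeneous, choices are coded as 0/1 so that the average choice $m$ coincides with the choice probability, and all attribute terms are collapsed into a single number `c`. The function names are hypothetical.

```python
import math

def logistic(v):
    """Logit choice probability e^v / (1 + e^v)."""
    return 1.0 / (1.0 + math.exp(-v))

def follow_equilibrium(J, costs, m_start=0.0, iterations=2000):
    """Track the self-consistent average choice m = logistic(J*m + c) as c varies.

    Each fixed point is found by plain iteration, started from the equilibrium
    reached for the previous value of c (a slow change of the attribute).
    """
    m = m_start
    path = []
    for c in costs:
        for _ in range(iterations):
            m = logistic(J * m + c)
        path.append(m)
    return path

def largest_jump(path):
    """Largest change of the average choice between two consecutive values of c."""
    return max(abs(b - a) for a, b in zip(path, path[1:]))

costs = [-6.0 + 0.05 * k for k in range(121)]   # attribute term swept from -6 to 0
no_interaction = follow_equilibrium(J=0.0, costs=costs)
with_interaction = follow_equilibrium(J=6.0, costs=costs)
print(largest_jump(no_interaction), largest_jump(with_interaction))
```

Without interaction the average choice changes by small amounts at every step, while for a sufficiently strong interaction one of the steps produces a jump of order one: the equilibrium that was being followed disappears, which is the tipping point of the interacting models.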

Though the mean field assumption might be seen as a crude approximation, since it 
considers a uniform and fixed kind of interaction, one should bear in mind that statistical 
physics has built throughout the twentieth century the expertise needed to consider a wide 
range of forms for the interaction parameters $J_{ij}$, of both deterministic and random nature,
so that a partial success in the application of mean field theory might be enhanced by 
browsing through a rich variety of well developed, though analytically more demanding, 
theories. 

Nevertheless, an empirical attempt to assess the actual descriptive and predictive power 
of such models has not been carried out to date: the natural course for such a study would 
be to start by empirically testing the mean field picture, as it was done for discrete choice 
in the seventies (see Figure 1), and to proceed by enhancing it with the help available from 
the econometrics, social science, and statistical physics communities. Two recent examples 
of empirical studies of mean-field models can be found in [65] and [29].

Chapter 3

The Curie-Weiss model



The Curie-Weiss model was first introduced in 1907 by Pierre Weiss [70] as a proposal for
a phenomenological model capable of explaining the experimental observations carried out
by Pierre Curie in 1895 [18], concerning the dependence on temperature of the magnetic
nature of metals such as iron, nickel, and magnetite.

Iron and nickel are materials capable of retaining a degree of magnetization, which 
we call spontaneous magnetization, after having been exposed to a magnetic field: such 
materials are said to be ferromagnetic, from the Latin name for iron. However, it had been
known since the days of Faraday ([18], pag. 1) that these materials tend to lose their ability
to retain magnetization as their temperature increases.

Pierre Curie's experiments showed not only that the loss of the ferromagnetic property
indeed occurs, but also that it occurs in a very peculiar fashion. For each of the
materials he considered, he found a definite temperature at which spontaneous magnetization
vanishes abruptly, giving rise to an irregular point in the graph plotting spontaneous
magnetization versus temperature (see Figure 3.1): we now call this temperature the Curie
temperature for the given material.

Figure 3.1: Pierre Curie's measurements in 1895

Figure 3.2: Pierre Weiss's measurements (crosses) fitted against his theoretical prediction (line) in
1907: the graph shows the dependence of spontaneous magnetization on temperature for magnetite

Weiss's model arises from physical considerations about the nature of magnetic interactions
between atoms: he claims that single atoms must experience, as well as the external
field, a sum of all the fields produced by all the other particles inside the material. He calls
this field a "molecular field" (champ moléculaire), and by adding a term corresponding to
this field inside the balance equation derived by Paul Langevin to describe paramagnetic
materials (that is, magnetic materials that do not retain magnetization after exposure to a
field), he formulates a balance equation for ferromagnetic materials.

In his 1907 paper Weiss shows that the theoretical predictions of his model are in
remarkable agreement with physical reality by fitting them against measurements, carried out
by himself, on an ellipsoid made of magnetite (Figure 3.2).

Today we know that the Curie-Weiss model is not completely accurate: indeed, it is well
known that some physically measurable quantities for ferromagnetic materials, called critical
exponents, are not predicted correctly by it (see [41], pag. 425). The subsequent study of
more detailed models, such as the Ising model, has brought to light the reason for such a
mismatch: when rewritten in the language of modern statistical mechanics, the model of
Curie-Weiss is readily seen to be equivalent to one where all particles interact with
each other. This turns out to be too strong an assumption for a system where all particles
sit next to each other geometrically and interact, according to quantum mechanics,
only up to a very short range. On the other hand, the Ising model, which still makes use
of all of Weiss's other simplifying assumptions about interaction between particles, manages
to predict critical exponents correctly, just by assuming that particles only interact with
their nearest neighbours on a regular lattice. From a mathematical point of view, however,
this modification implies a drastic reduction of the symmetry of the problem, which has so
far proved to be analytically intractable in more than two dimensions (see [41], pag. 341).

All objections standing, it is nevertheless worth remembering that the degree of agreement
between theory and reality for the Curie-Weiss model is truly remarkable given the
simplicity of the model. Today, Weiss's "molecular field" assumption is called a mean field
assumption, and scientific wisdom tells that this assumption is of great value in exploring
the phase structure of a system so that, when faced with a new situation, one would try
mean field first ([41], pag. 423).



3.1 The model 

As a modern statistical mechanics model, the Curie-Weiss model is defined by its Hamiltonian:

$$H(\sigma) = -\sum_{i,j=1}^{N} J_{ij}\,\sigma_i\sigma_j - \sum_{i=1}^{N} h_i\,\sigma_i. \tag{3.1}$$

We consider Ising spins, $\sigma_i = \pm 1$, subject to a uniform magnetic field $h_i = h$ and to
isotropic interactions $J_{ij} = J/2N$, so that we have

$$H(\sigma) = -\frac{J}{2N}\sum_{i,j=1}^{N}\sigma_i\sigma_j - h\sum_{i=1}^{N}\sigma_i. \tag{3.2}$$

If we now introduce the magnetization of a configuration $\sigma$ as

$$m(\sigma) = \frac{1}{N}\sum_{i=1}^{N}\sigma_i,$$

we can rewrite the Hamiltonian per particle as:

$$\frac{H(\sigma)}{N} = -\frac{J}{2}\,m(\sigma)^2 - h\,m(\sigma). \tag{3.3}$$

The established statistical mechanics framework defines the equilibrium value of an
observable $f(\sigma)$ as the average with respect to the Gibbs distribution defined by the
Hamiltonian. We call this average the Gibbs state for $f(\sigma)$, and write it explicitly as:

$$\langle f \rangle = \frac{\sum_\sigma f(\sigma)\,e^{-H(\sigma)}}{\sum_\sigma e^{-H(\sigma)}}.$$

The main observable for our model is the average value of a spin configuration, i.e. the
magnetization, $m(\sigma)$, which explicitly reads:

$$m(\sigma) = \frac{1}{N}\sum_{i=1}^{N}\sigma_i.$$

Our quantity of interest is therefore $\langle m \rangle$: to find it, as well as the moments of many
other observables, statistical mechanics leads us to consider the pressure function:

$$p_N = \frac{1}{N}\ln\sum_\sigma e^{-H(\sigma)}.$$

It is easy to verify that, once it has been derived exactly, the pressure is capable of
generating the Gibbs state for the magnetization as

$$\langle m \rangle = \frac{\partial p_N}{\partial h}.$$
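
The relation between the pressure and the Gibbs state of the magnetization can be checked numerically at finite $N$. The following Python sketch is not taken from the text: it assumes the Hamiltonian per particle $-\frac{J}{2}m^2 - hm$ with the Boltzmann weight $e^{-H}$, groups the $2^N$ configurations by their number `k` of up spins, and compares a finite-difference derivative of the pressure with the directly computed average. The function names are hypothetical.

```python
import math

def log_weights(N, J, h):
    """Log of the total Boltzmann weight of each magnetization level.

    A configuration with k spins up has m = (2k - N)/N and weight
    exp(-H) = exp(N*(J/2*m^2 + h*m)); there are C(N, k) such configurations.
    """
    out = []
    for k in range(N + 1):
        m = (2 * k - N) / N
        log_binom = math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
        out.append((m, log_binom + N * (0.5 * J * m * m + h * m)))
    return out

def pressure(N, J, h):
    """p_N = (1/N) ln Z_N, using a log-sum-exp for numerical stability."""
    lw = [w for _, w in log_weights(N, J, h)]
    top = max(lw)
    return (top + math.log(sum(math.exp(w - top) for w in lw))) / N

def gibbs_magnetization(N, J, h):
    """Gibbs average of m, computed directly from the weights."""
    lw = log_weights(N, J, h)
    top = max(w for _, w in lw)
    Z = sum(math.exp(w - top) for _, w in lw)
    return sum(m * math.exp(w - top) for m, w in lw) / Z

N, J, h, eps = 50, 0.8, 0.3, 1e-5
derivative = (pressure(N, J, h + eps) - pressure(N, J, h - eps)) / (2 * eps)
print(derivative, gibbs_magnetization(N, J, h))
```

Summing over magnetization levels rather than over configurations keeps the cost linear in $N$, which is possible only because the mean-field Hamiltonian depends on $\sigma$ through $m(\sigma)$ alone.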



3.2 Existence of the thermodynamic limit

We show two ways of proving the existence of the thermodynamic limit in the Curie-Weiss
model. The first method follows [5] in exploiting directly the convexity of the Hamiltonian
in order to prove subadditivity in $N$ for the system's pressure.

The second method consists in a refinement of the first, and covers models for which
the Hamiltonian is not necessarily convex, such as the two-population model considered in
the next chapter. It is important to point out that a careful application of this method to
the Sherrington-Kirkpatrick spin glass model allowed Guerra [36] to settle the question,
open for twenty years, concerning the existence of the thermodynamic limit.

3.2.1 Existence by convexity of the Hamiltonian 

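We consider a system of $N$ spins defined as above. Following [5] we split the system in two
subsystems of $N_1$ and $N_2$ spins, respectively, with $N_1 + N_2 = N$. For each of these systems
we define partial magnetizations

$$m_1 = \frac{1}{N_1}\sum_{i=1}^{N_1}\sigma_i, \qquad m_2 = \frac{1}{N_2}\sum_{i=N_1+1}^{N}\sigma_i,$$

which allow us to define partial Hamiltonians

$$H_{N_1} = -N_1\Big(\frac{J}{2}m_1^2 + h m_1\Big) \quad\text{and}\quad H_{N_2} = -N_2\Big(\frac{J}{2}m_2^2 + h m_2\Big).$$

We have by definition that

$$m = \frac{N_1}{N}\,m_1 + \frac{N_2}{N}\,m_2, \tag{3.4}$$

and since $f(x) = x^2$ is a convex function we also have that

$$m^2 \le \frac{N_1}{N}\,m_1^2 + \frac{N_2}{N}\,m_2^2. \tag{3.5}$$

We are now ready to prove the following

Proposition 1. There exists a function $p(J, h)$ such that

$$\lim_{N\to\infty} p_N = p.$$

Proof. Relations (3.4) and (3.5) imply that, for $J \ge 0$,

$$H_N(\sigma) \ge H_{N_1}(\sigma{:}1..N_1) + H_{N_2}(\sigma{:}N_1+1..N),$$

and this in turn gives

$$Z_N = \sum_\sigma e^{-H_N(\sigma)} \le \sum_\sigma e^{-H_{N_1}(\sigma:1..N_1) - H_{N_2}(\sigma:N_1+1..N)} = Z_{N_1} Z_{N_2},$$

where $\sigma{:}1..N_1 = \{\sigma_1, .., \sigma_{N_1}\}$ and $\sigma{:}N_1+1..N = \{\sigma_{N_1+1}, .., \sigma_N\}$. Hence we have the
following inequality

$$N p_N \le N_1 p_{N_1} + N_2 p_{N_2}, \qquad \text{for } N_1 + N_2 = N.$$

This identifies the sequence $\{N p_N\}$ as a subadditive sequence, for which the following
holds:

$$\lim_{N\to\infty}\frac{N p_N}{N} = \lim_{N\to\infty} p_N = \inf_N p_N.$$

Hence in order to verify the existence of a finite limit we need to verify that the sequence
$\{p_N\}$ is bounded below, which follows from the boundedness of the intensive quantity

$$\frac{H(\sigma)}{N} = -\frac{J}{2}m^2 - hm,$$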
for $-1 \le m \le 1$. Indeed, if $|H(\sigma)|/N \le K$ for every configuration $\sigma$, then

$$p_N = \frac{1}{N}\ln\sum_\sigma e^{-H(\sigma)} \ge \frac{1}{N}\ln\big(2^N e^{-NK}\big) = \ln 2 - K,$$

so the result follows.



□ 
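
The subadditivity of $N p_N$ used in the proof can be verified directly for small systems. The sketch below is an illustration only (the helper names are hypothetical): it assumes $J \ge 0$, evaluates $\ln Z_N$ exactly by summing over the number `k` of up spins, and checks that $\ln Z_{N_1} + \ln Z_{N_2} - \ln Z_{N_1+N_2}$ is never negative.

```python
import math

def log_partition(N, J, h):
    """ln Z_N for H = -N (J/2 m^2 + h m), summing over the number k of up spins."""
    logs = []
    for k in range(N + 1):
        m = (2 * k - N) / N
        log_binom = math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
        logs.append(log_binom + N * (0.5 * J * m * m + h * m))
    top = max(logs)
    return top + math.log(sum(math.exp(x - top) for x in logs))

def subadditivity_gap(N1, N2, J, h):
    """ln Z_{N1} + ln Z_{N2} - ln Z_{N1+N2}; non-negative when J >= 0."""
    return (log_partition(N1, J, h) + log_partition(N2, J, h)
            - log_partition(N1 + N2, J, h))

J, h = 1.5, 0.2
gaps = [subadditivity_gap(n1, n2, J, h) for n1 in range(1, 30) for n2 in range(1, 30)]
# Along a doubling sequence subadditivity gives p_{2N} <= p_N.
pressures = [log_partition(N, J, h) / N for N in (10, 20, 40, 80, 160)]
print(min(gaps), pressures)
```

Along the doubling sequence the pressures decrease towards their infimum, in agreement with the statement that the limit of a subadditive sequence divided by $N$ equals its infimum.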



3.2.2 Existence by interpolation 

We shall now prove that our model admits a thermodynamic limit by exploiting an existence
theorem provided for mean field models in [8]: the result states that the existence of the
pressure per particle for large volumes is guaranteed by a monotonicity condition on the
equilibrium state of the Hamiltonian. We therefore prove the existence of the thermodynamic
limit independently of an exact solution. Such a line of enquiry is pursued in view of
the study of models that may involve random interactions of spin glass or random
graph type, and that might or might not come with an exact expression for the pressure.

Proposition 2. There exists a function $p(J, h)$ such that

$$\lim_{N\to\infty} p_N = p.$$

Proof. Theorem 1 in [8] states that given a Hamiltonian $H_N$ such that $H_N/N$ is bounded in $N$,
and its associated equilibrium state $\omega_N$, the model admits a thermodynamic limit whenever
the physical condition

$$\omega_N(H_N) \ge \omega_N(H_{N_1}) + \omega_N(H_{N_2}), \qquad N_1 + N_2 = N, \tag{3.6}$$

is verified.

For the Curie-Weiss model the condition is easy to verify once we define partial
magnetizations

$$m_1 = \frac{1}{N_1}\sum_{i=1}^{N_1}\sigma_i, \qquad m_2 = \frac{1}{N_2}\sum_{i=N_1+1}^{N}\sigma_i.$$

This gives that

$$m = \frac{N_1}{N}\,m_1 + \frac{N_2}{N}\,m_2,$$

so that

$$H_N - H_{N_1} - H_{N_2} = -N\Big(\frac{J}{2}m^2 + hm\Big) + N_1\Big(\frac{J}{2}m_1^2 + hm_1\Big) + N_2\Big(\frac{J}{2}m_2^2 + hm_2\Big) =$$

$$= -\frac{J}{2}\big(N m^2 - N_1 m_1^2 - N_2 m_2^2\big) - h\big(N m - N_1 m_1 - N_2 m_2\big) =$$

$$= -N\frac{J}{2}\Big(m^2 - \frac{N_1}{N}m_1^2 - \frac{N_2}{N}m_2^2\Big) \ge 0.$$

The last inequality follows from convexity of the function $f(x) = x^2$, and since it holds
for every configuration $\sigma$, it also implies (3.6), proving the result.

□ 



3.3 Factorization properties 

In this section we shall prove that the correlation functions of our model factorize completely 
in the thermodynamic limit, for almost every choice of parameters. This implies that all the 
thermodynamic properties of the system can be described by the magnetization. Indeed, 
the exact solution of the model to be derived in the next section comes as an equation of 
state which, as expected, turns out to be the same as the balance equation derived by Weiss. 

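Proposition 3.

$$\lim_{N\to\infty}\big(\omega_N(m^2) - \omega_N(m)^2\big) = 0$$

for almost every choice of $h$.

Proof. We recall the definition of the Hamiltonian per particle

$$\frac{H_N(\sigma)}{N} = -\frac{J}{2}m^2 - hm,$$

and of the pressure per particle

$$p_N = \frac{1}{N}\ln\sum_\sigma e^{-H_N(\sigma)}.$$

By taking first and second partial derivatives of $p_N$ with respect to $h$ we get

$$\frac{\partial p_N}{\partial h} = \frac{1}{N}\sum_\sigma N m(\sigma)\,\frac{e^{-H_N(\sigma)}}{Z_N} = \omega_N(m), \qquad \frac{\partial^2 p_N}{\partial h^2} = N\big(\omega_N(m^2) - \omega_N(m)^2\big).$$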

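By using these relations we can bound above the integral with respect to $h$ of the
fluctuations of $m$ in the Gibbs state:

$$\Big|\int_{h^{(1)}}^{h^{(2)}}\big(\omega_N(m^2) - \omega_N(m)^2\big)\,dh\Big| = \frac{1}{N}\Big|\int_{h^{(1)}}^{h^{(2)}}\frac{\partial^2 p_N}{\partial h^2}\,dh\Big| = \frac{1}{N}\Big|\Big[\frac{\partial p_N}{\partial h}\Big]_{h^{(1)}}^{h^{(2)}}\Big| \le \frac{1}{N}\Big(\big|\omega_N(m)\big|_{h^{(2)}} + \big|\omega_N(m)\big|_{h^{(1)}}\Big) = O\Big(\frac{1}{N}\Big). \tag{3.7}$$

On the other hand we have that

$$\omega_N(m) = \frac{\partial p_N}{\partial h} \quad\text{and}\quad \omega_N(m^2) = 2\,\frac{\partial p_N}{\partial J},$$

so, by convexity of the thermodynamic pressure $p = \lim_{N\to\infty} p_N$, both quantities $\partial p_N/\partial h$ and
$\partial p_N/\partial J$ have well defined thermodynamic limits almost everywhere. This together with (3.7)
implies that

$$\lim_{N\to\infty}\big(\omega_N(m^2) - \omega_N(m)^2\big) = 0 \quad\text{a.e. in } h. \tag{3.8}$$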
□ 



The last proposition proves that $m(\sigma)$ is a self-averaging quantity, that is, a random
quantity whose fluctuations vanish in the thermodynamic limit. This is indeed a powerful
result, which can be exploited thanks to the following

Proposition 4. (Cauchy-Schwarz inequality) Let $X$ and $Y$ be two random variables defined
on a finite probability space such that $P(X_i) = P(Y_i) = p_i$. Then the following holds

$$\big|E(XY) - E(X)E(Y)\big| \le \sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}.$$

Proof. Let us define the following quantities:

$$E(X) = \sum_i X_i p_i = \mu_X, \qquad \mathrm{Var}(X) = \sigma_X^2,$$

$$E(Y) = \sum_i Y_i p_i = \mu_Y, \qquad \mathrm{Var}(Y) = \sigma_Y^2.$$

If we now define rescaled versions of $X$ and $Y$:

$$\tilde X = \frac{X - \mu_X}{\sigma_X} \quad\text{and}\quad \tilde Y = \frac{Y - \mu_Y}{\sigma_Y},$$

we get that $\{\tilde X_i\,p_i^{1/2}\}$ and $\{\tilde Y_i\,p_i^{1/2}\}$ are vectors of Euclidean length equal to 1 (since their
squared lengths are the variances of $\tilde X$ and $\tilde Y$, which have been normalized). This implies

$$\big|E(\tilde X\tilde Y)\big| = \Big|\sum_i \tilde X_i\tilde Y_i\,p_i\Big| = \Big|\sum_i \big(\tilde X_i\,p_i^{1/2}\big)\big(\tilde Y_i\,p_i^{1/2}\big)\Big| \le 1, \tag{3.9}$$

where the inequality only points out that $E(\tilde X\tilde Y)$ is the projection of a unit vector against
another, and therefore that its modulus is less than or equal to one.

If we now substitute back $X$ and $Y$ into (3.9) we get our result.

□ 

By putting together the self-averaging property and the Cauchy-Schwarz inequality we
get the following

Proposition 5. Given any integer $k$ we have that

$$\lim_{N\to\infty}\big(\omega_N(m^k) - \omega_N(m)^k\big) = 0$$

for almost every choice of $h$.

Proof. Applying the Cauchy-Schwarz inequality to $X = m^{k-1}$ and $Y = m$ we get that

$$\big|\omega_N(m^{k-1}\,m) - \omega_N(m^{k-1})\,\omega_N(m)\big| \le \sqrt{\mathrm{Var}_N(m^{k-1})\,\mathrm{Var}_N(m)}. \tag{3.10}$$

Now self-averaging tells us that $\mathrm{Var}_N(m)$ tends to zero in the limit, and since $m^{k-1}$ is
a bounded quantity, (3.10) implies:

$$\lim_{N\to\infty}\big(\omega_N(m^k) - \omega_N(m^{k-1})\,\omega_N(m)\big) = 0,$$

and the rest of the proposition follows by induction on the same argument.

□ 

The last proposition is very important for this model, because the mean-field nature of
the system allows us to use the factorization of the magnetization in order to prove factorization
of spin correlation functions, thus characterizing all the thermodynamics of the system.

In the following proposition we shall only prove the factorization of 2-spin correlations: the
factorization of $k$-spin correlations is done in the same way.



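Proposition 6.

$$\lim_{N\to\infty}\big(\omega_N(\sigma_i\sigma_j) - \omega_N(\sigma_i)\,\omega_N(\sigma_j)\big) = 0$$

for almost every choice of $h$, whenever $\sigma_i$, $\sigma_j$ are distinct spins.

Proof. We can now use the self-averaging of $m(\sigma)$ to prove the factorization of correlation
functions. This is done by exploiting the translation invariance of the Gibbs measure on spins
(that is, its invariance under relabelling of the spins), which in turn follows from the
mean-field nature of the model:

$$\omega_N(m) = \omega_N\Big(\frac{1}{N}\sum_{i=1}^{N}\sigma_i\Big) = \omega_N(\sigma_1),$$

$$\omega_N(m^2) = \omega_N\Big(\frac{1}{N^2}\sum_{i,j=1}^{N}\sigma_i\sigma_j\Big) = \frac{1}{N^2}\sum_{i\ne j}\omega_N(\sigma_i\sigma_j) + \frac{1}{N^2}\sum_{i=j}\omega_N(\sigma_i\sigma_j) = \frac{N-1}{N}\,\omega_N(\sigma_1\sigma_2) + \frac{1}{N}. \tag{3.11}$$

We have that (3.11) and (3.8) imply

$$\lim_{N\to\infty}\big(\omega_N(\sigma_i\sigma_j) - \omega_N(\sigma_i)\,\omega_N(\sigma_j)\big) = 0, \quad\text{for a.e. } h, \tag{3.12}$$

which verifies our statement for all couples of spins $i \ne j$.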



□ 
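
The two identities used in the proof, $\omega_N(m) = \omega_N(\sigma_1)$ and the decomposition of $\omega_N(m^2)$, hold exactly at every finite $N$ and can be checked by brute force on a small system. The following Python sketch is an illustration with hypothetical names; it enumerates all $2^N$ configurations, so it is only meant for small $N$.

```python
import itertools
import math

def gibbs_averages(N, J, h):
    """Brute-force Gibbs averages over all 2^N configurations (small N only)."""
    Z = m1 = m2 = s1 = s1s2 = 0.0
    for sigma in itertools.product((-1, 1), repeat=N):
        m = sum(sigma) / N
        weight = math.exp(N * (0.5 * J * m * m + h * m))   # exp(-H(sigma))
        Z += weight
        m1 += m * weight
        m2 += m * m * weight
        s1 += sigma[0] * weight
        s1s2 += sigma[0] * sigma[1] * weight
    return {"m": m1 / Z, "m2": m2 / Z, "s1": s1 / Z, "s1s2": s1s2 / Z}

N = 10
avg = gibbs_averages(N, J=0.7, h=0.2)
# omega(m) = omega(sigma_1)
print(avg["m"] - avg["s1"])
# omega(m^2) = (N-1)/N * omega(sigma_1 sigma_2) + 1/N
print(avg["m2"] - ((N - 1) / N * avg["s1s2"] + 1 / N))
# truncated two-spin correlation at this finite N
print(avg["s1s2"] - avg["s1"] ** 2)
```

The first two printed numbers vanish up to rounding, while the third is the finite-size correlation that the proposition shows to disappear as $N \to \infty$ for almost every $h$.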



The self-averaging of the magnetization has been proved directly here: this, however,
can be seen as a consequence of the convexity of the pressure. Indeed, the second derivative
of any convex function exists almost everywhere: this is a consequence of the first derivative
existing almost everywhere and being monotonically increasing (see, e.g., [57]).

Therefore existence almost everywhere of $\partial^2 p/\partial h^2$ together with the intensivity property
of the magnetization implies trivially that its fluctuations vanish in the thermodynamic
limit. This also implies that, since energy per particle is another intensive quantity which is
obtained by differentiating the pressure with respect to $J$, energy per particle is a self-averaging
quantity too.

As we can see from Proposition 5, factorization of spins only holds a.e. for $h$, and indeed
it can be proved that factorization doesn't hold at $h = 0$, $J > 1$. However, by using
the self-averaging of energy-per-particle proved above, we can similarly obtain a weaker
factorization rule which covers this regime:



Proposition 7.

$$\lim_{N\to\infty}\big(\omega_N(\sigma_i\sigma_j\sigma_k\sigma_l) - \omega_N(\sigma_i\sigma_j)\,\omega_N(\sigma_k\sigma_l)\big) = 0$$

for almost every choice of $J$, whenever $\sigma_i$, $\sigma_j$, $\sigma_k$, $\sigma_l$ are distinct spins.

Proof. The proof follows the same argument as Proposition 5, and uses the self-averaging
of the energy per particle instead of the self-averaging of the magnetization.

□ 
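
Both the self-averaging of the magnetization and its failure at $h = 0$, $J > 1$ can be observed numerically. The sketch below is an illustration with hypothetical names: it computes $\omega_N(m^2) - \omega_N(m)^2$ exactly at finite $N$ by summing over magnetization levels, for one value of $J > 1$ with and without an external field.

```python
import math

def magnetization_variance(N, J, h):
    """omega_N(m^2) - omega_N(m)^2, computed exactly from the magnetization levels."""
    levels = []
    for k in range(N + 1):
        m = (2 * k - N) / N
        log_binom = math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
        levels.append((m, log_binom + N * (0.5 * J * m * m + h * m)))
    top = max(w for _, w in levels)
    Z = sum(math.exp(w - top) for _, w in levels)
    mean = sum(m * math.exp(w - top) for m, w in levels) / Z
    second = sum(m * m * math.exp(w - top) for m, w in levels) / Z
    return second - mean * mean

sizes = (50, 200, 800)
with_field = [magnetization_variance(N, 1.5, 0.1) for N in sizes]
zero_field = [magnetization_variance(N, 1.5, 0.0) for N in sizes]
for N, a, b in zip(sizes, with_field, zero_field):
    print(N, a, b)
```

With $h \ne 0$ the variance shrinks as $N$ grows, whereas at $h = 0$ the Gibbs state is a symmetric mixture of two opposite magnetizations, so the mean vanishes and the variance stays of order one.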



3.4 Solution of the model 

We shall derive upper and lower bounds for the thermodynamic limit of the pressure. The 
lower bound is obtained through the standard entropic variational principle, while the upper 
bound is derived by a decoupling strategy. 

3.4.1 Upper bound 

In order to find an upper bound for the pressure we shall divide the configuration space into
a partition of microstates of equal magnetization, following [191 EZl EH]. Since the system
consists of $N$ spins, its magnetization can take exactly $N + 1$ values, which are the elements
of the set

$$R_N = \Big\{-1,\; -1 + \frac{2}{N},\; \dots,\; 1 - \frac{2}{N},\; 1\Big\}.$$

Clearly for every $m(\sigma)$ we have that

$$\sum_{\tilde m\in R_N}\delta_{m,\tilde m} = 1,$$

where $\delta_{x,y}$ is a Kronecker delta. Therefore we have that

$$Z_N = \sum_\sigma\exp\Big\{N\Big(\frac{J}{2}m^2 + hm\Big)\Big\} = \sum_\sigma\sum_{\tilde m\in R_N}\delta_{m,\tilde m}\exp\Big\{N\Big(\frac{J}{2}m^2 + hm\Big)\Big\}. \tag{3.13}$$

Thanks to the Kronecker delta symbols, we can substitute $m$ (the average of the spins
within a configuration) with the parameter $\tilde m$ (which is not coupled to the spin
configurations) in any convenient fashion. Therefore we can use the following relation in order to
linearize the quadratic term appearing in the Hamiltonian

$$(m - \tilde m)^2 = 0,$$

that is, $m^2 = 2m\tilde m - \tilde m^2$ whenever $\delta_{m,\tilde m} \ne 0$, and once we've carried out this substitution
into (3.13) we are left with a function which depends only linearly on $m$:

$$Z_N = \sum_\sigma\sum_{\tilde m\in R_N}\delta_{m,\tilde m}\exp\Big\{N\Big(\frac{J}{2}(2m\tilde m - \tilde m^2) + hm\Big)\Big\},$$

and bounding above the Kronecker deltas by 1 we get

$$Z_N \le \sum_\sigma\sum_{\tilde m\in R_N}\exp\Big\{N\Big(\frac{J}{2}(2m\tilde m - \tilde m^2) + hm\Big)\Big\}.$$

Since both sums are taken over finitely many terms, it is possible to exchange the order
of the two summation symbols, in order to carry out the sum over the spin configurations,
which now factorizes, thanks to the linearity of the interaction with respect to $m$. This
way we get:

$$Z_N \le \sum_{\tilde m\in R_N} G(\tilde m),$$

where

$$G(\tilde m) = \exp\Big\{-N\frac{J}{2}\tilde m^2\Big\}\cdot 2^N\big(\cosh(J\tilde m + h)\big)^N. \tag{3.14}$$

Since the summation is taken over the range $R_N$, of cardinality $N + 1$, we get that the
total number of summands is $N + 1$. Therefore

$$Z_N \le (N + 1)\,\sup_{\tilde m} G, \tag{3.15}$$

which leads to the following upper bound for $p_N$:

$$p_N = \frac{1}{N}\ln Z_N \le \frac{1}{N}\ln\Big((N + 1)\sup_{\tilde m} G\Big) = \frac{1}{N}\ln(N + 1) + \frac{1}{N}\sup_{\tilde m}\ln G, \tag{3.16}$$

where the last equality follows from monotonicity of the logarithm.
Now defining the $N$-independent function

$$p_{up}(\tilde m) = \frac{1}{N}\ln G = \ln 2 - \frac{J}{2}\tilde m^2 + \ln\cosh(J\tilde m + h),$$

and keeping in mind that $\lim_{N\to\infty}\frac{1}{N}\ln(N + 1) = 0$, in the thermodynamic limit we get:

$$\limsup_{N\to\infty} p_N \le \sup_{\tilde m} p_{up}(\tilde m). \tag{3.17}$$

We can summarize the previous computation into the following:



Lemma 1. Given a Hamiltonian as defined in (3.3), and defining the pressure per particle
as $p_N = \frac{1}{N}\ln Z$, given parameters $J$ and $h$, the following inequality holds:

$$\limsup_{N\to\infty} p_N \le \sup_{\tilde m} p_{up},$$

where

$$p_{up}(\tilde m) = \ln 2 - \frac{J}{2}\tilde m^2 + \ln\cosh(J\tilde m + h),$$

and $\tilde m \in [-1, 1]$.

We shall give two ways of deriving a lower bound for the pressure: indeed, it is important
to keep in mind that having as many bounding techniques as possible can be a good way of
approaching more refined models.

3.4.2 Lower bound by convexity of the Hamiltonian 

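Proposition 8. Given a Hamiltonian as defined in (3.3) and its associated pressure per
particle $p_N = \frac{1}{N}\ln Z$, the following inequality holds for every $J$, $h$:

$$p_N \ge \sup_{-1\le\tilde m\le 1} p_{low},$$

where

$$p_{low}(\tilde m) = -\frac{J}{2}\tilde m^2 + \ln 2 + \ln\cosh(J\tilde m + h).$$

Proof. We recall the Hamiltonian per particle written in terms of the configuration's
magnetization $m(\sigma)$:

$$\frac{H(\sigma)}{N} = -\frac{J}{2}m^2 - hm.$$

Now, given any number $\tilde m \in [-1, +1]$, the following holds:

$$(m - \tilde m)^2 \ge 0 \;\Longrightarrow\; m^2 \ge 2m\tilde m - \tilde m^2,$$

so that

$$p_N = \frac{1}{N}\ln Z_N = \frac{1}{N}\ln\sum_\sigma\exp\Big\{N\Big(\frac{J}{2}m^2 + hm\Big)\Big\} \ge$$

$$\ge \frac{1}{N}\ln\sum_\sigma\exp\Big\{N\Big(Jm\tilde m - \frac{J}{2}\tilde m^2 + hm\Big)\Big\} =$$

$$= \frac{1}{N}\ln\Big(\exp\Big\{-N\frac{J}{2}\tilde m^2\Big\}\sum_\sigma\exp\{N(J\tilde m\,m + hm)\}\Big) =$$

$$= -\frac{J}{2}\tilde m^2 + \frac{1}{N}\ln\Big(2^N\cosh(J\tilde m + h)^N\Big) = -\frac{J}{2}\tilde m^2 + \ln 2 + \ln\cosh(J\tilde m + h).$$

This way we get a new lower bound which can be expressed as

$$p_N \ge \sup_{-1\le\tilde m\le 1} p_{low},$$

where

$$p_{low}(\tilde m) = -\frac{J}{2}\tilde m^2 + \frac{1}{N}\ln\Big(2^N\cosh(J\tilde m + h)^N\Big) = \ln 2 - \frac{J}{2}\tilde m^2 + \ln\cosh(J\tilde m + h),$$

which is the result.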

□ 
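
The upper bound (3.16) and the lower bound just proved sandwich the finite-$N$ pressure, and the width of the sandwich is $\frac{1}{N}\ln(N+1)$. The following Python sketch checks this numerically; it is an illustration with hypothetical names, it assumes $J \ge 0$, and it approximates the supremum by a maximum over a fine grid of $[-1, 1]$ rather than computing it exactly.

```python
import math

def pressure(N, J, h):
    """Exact p_N = (1/N) ln Z_N, summing over the number k of up spins."""
    logs = []
    for k in range(N + 1):
        m = (2 * k - N) / N
        log_binom = math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
        logs.append(log_binom + N * (0.5 * J * m * m + h * m))
    top = max(logs)
    return (top + math.log(sum(math.exp(x - top) for x in logs))) / N

def p_up(m, J, h):
    """Bounding function p_up = p_low of the text."""
    return math.log(2.0) - 0.5 * J * m * m + math.log(math.cosh(J * m + h))

def sup_on_grid(func, points=20001):
    """Grid approximation (from below) of the supremum over [-1, 1]."""
    return max(func(-1.0 + 2.0 * i / (points - 1)) for i in range(points))

J, h = 1.5, 0.1
best = sup_on_grid(lambda m: p_up(m, J, h))
sandwich = []
for N in (10, 100, 1000):
    p = pressure(N, J, h)
    sandwich.append((N, best, p, best + math.log(N + 1) / N))
    print(sandwich[-1])
```

As $N$ grows the two sides close on each other, which is how the exact value of the limiting pressure is obtained.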



3.4.3 Variational lower bound 

The second lower bound is provided by exploiting the well-known Gibbs entropic variational 
principle (see [58], pag. 188). In our case, instead of considering the whole space of ansatz 
probability distributions considered in [58], we shall restrict to a much smaller one, and 
use the upper bound derived in the last section in order to show that the lower bound 
corresponding to the restricted space is sharp in the thermodynamic limit. 

The mean-field nature of our Hamiltonian allows us to restrict the variational problem
to a product measure with only one degree of freedom, represented by the non-interacting
Hamiltonian:

$$\tilde H = -r\sum_{i=1}^{N}\sigma_i,$$

and so, given a Hamiltonian $\tilde H$, we define the ansatz Gibbs state corresponding to it
for an observable $f(\sigma)$ as:

$$\tilde\omega(f) = \frac{\sum_\sigma f(\sigma)\,e^{-\tilde H(\sigma)}}{\sum_\sigma e^{-\tilde H(\sigma)}}.$$

In order to facilitate our task, we shall express the variational principle of [58] in the
following simple form:

Proposition 9. Let a Hamiltonian $H$, and its associated partition function $Z = \sum_\sigma e^{-H(\sigma)}$,
be given. Consider an arbitrary trial Hamiltonian $\tilde H$ and its associated partition function
$\tilde Z$. The following inequality holds:

$$\ln Z \ge \ln\tilde Z - \tilde\omega(H) + \tilde\omega(\tilde H). \tag{3.18}$$

Given a Hamiltonian as defined in (3.3) and its associated pressure per particle $p_N = \frac{1}{N}\ln Z$,
the following inequality follows from (3.18):

$$\liminf_{N\to\infty} p_N \ge \sup_{\tilde m} p'_{low}, \tag{3.19}$$

where

$$p'_{low}(\tilde m) = \frac{J}{2}\tilde m^2 + h\tilde m - \frac{1+\tilde m}{2}\ln\Big(\frac{1+\tilde m}{2}\Big) - \frac{1-\tilde m}{2}\ln\Big(\frac{1-\tilde m}{2}\Big), \tag{3.20}$$

and $\tilde m \in [-1, 1]$.

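Proof. The inequality (3.18) follows straightforwardly from Jensen's inequality:

$$e^{\tilde\omega(-H + \tilde H)} \le \tilde\omega\big(e^{-H + \tilde H}\big). \tag{3.21}$$

We recall the Hamiltonian:

$$H(\sigma) = -\frac{J}{2N}\sum_{i,j}\sigma_i\sigma_j - h\sum_i\sigma_i, \tag{3.22}$$

so that its expectation on the trial state is

$$\tilde\omega(H) = -\frac{J}{2N}\sum_{i,j}\tilde\omega(\sigma_i\sigma_j) - h\sum_i\tilde\omega(\sigma_i),$$

and a standard computation for the moments of a non-interacting system (i.e. for a perfect
gas) leads to

$$\tilde\omega(H) = -N\Big(1 - \frac{1}{N}\Big)\frac{J}{2}(\tanh r)^2 - \frac{J}{2} - Nh\tanh r. \tag{3.23}$$

Analogously, the trial Gibbs state of $\tilde H$ is:

$$\tilde\omega(\tilde H) = -Nr\tanh r,$$

and the non-interacting partition function is:

$$\tilde Z_N = \sum_\sigma e^{-\tilde H(\sigma)} = 2^N(\cosh r)^N,$$

which implies that the non-interacting pressure gives

$$\tilde p_N = \frac{1}{N}\ln\tilde Z_N = \ln 2 + \ln\cosh r.$$

So we can finally apply Proposition 9 in order to find a lower bound for the pressure
$p_N = \frac{1}{N}\ln Z_N$:

$$p_N = \frac{1}{N}\ln Z_N \ge \frac{1}{N}\Big(\ln\tilde Z_N - \tilde\omega(H) + \tilde\omega(\tilde H)\Big), \tag{3.24}$$

which explicitly reads:

$$p_N = \frac{1}{N}\ln Z_N \ge \ln 2 + \ln\cosh r + \frac{J}{2}(\tanh r)^2 + h\tanh r - r\tanh r + \frac{J}{2N} - \frac{J(\tanh r)^2}{2N}. \tag{3.25}$$

Taking the liminf over $N$ of the left hand side and the supremum in $r$ of the right hand side
we get (3.19), after performing the change of variables $\tilde m = \tanh r$, and obtaining the
following form for the right hand side:

$$p'_{low}(\tilde m) = \frac{J}{2}\tilde m^2 + h\tilde m - \frac{1+\tilde m}{2}\ln\Big(\frac{1+\tilde m}{2}\Big) - \frac{1-\tilde m}{2}\ln\Big(\frac{1-\tilde m}{2}\Big).$$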



□ 

3.4.4 Exact solution of the model



We have derived two lower bounds and one upper bound to the thermodynamic pressure,
which are given by the suprema w.r.t. $\tilde m$ of the following functions:

$$p_{up}(\tilde m) = p_{low}(\tilde m) = \ln 2 - \frac{J}{2}\tilde m^2 + \ln\cosh(J\tilde m + h),$$

$$p'_{low}(\tilde m) = \frac{J}{2}\tilde m^2 + h\tilde m - \frac{1+\tilde m}{2}\ln\Big(\frac{1+\tilde m}{2}\Big) - \frac{1-\tilde m}{2}\ln\Big(\frac{1-\tilde m}{2}\Big).$$

Since $p_{up} = p_{low}$, the supremum of this function gives the thermodynamic value of the
pressure, and thus provides the exact solution to the model. However, it is important to
verify that the bounds provided by all functions coincide, since for more general cases one
of the bounding arguments may fail, as indeed happens in the next chapter, where a bound
of type $p_{low}$ cannot be found due to lack of convexity in the Hamiltonian. Furthermore, $p'_{low}$
has a direct thermodynamic interpretation, as shall be explained in the following section.

For the standard Curie-Weiss model that we are studying here the equivalence of the
two bounds can be proved by way of a peculiar property of the Legendre transformation, 
and we will do this in this section. 

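Proposition 10. The function

$$f^*(y) = \frac{1}{J}\Big(\frac{1+y}{2}\ln\frac{1+y}{2} + \frac{1-y}{2}\ln\frac{1-y}{2} - yh\Big)$$

is the Legendre transform of

$$f(x) = \frac{1}{J}\ln 2\cosh(Jx + h).$$

Proof. The Legendre transformation is defined by

$$f^*(y) = \sup_x\,\big(xy - f(x)\big).$$

Since we are dealing with a convex function we can find the supremum by differentiation:

$$\frac{d}{dx}\big(xy - f(x)\big) = y - \tanh(Jx + h) = 0,$$

which implies

$$Jx = \operatorname{arctanh} y - h,$$

so that by substituting we find that the Legendre transform of $f$ is

$$f^*(y) = y\,\frac{1}{J}(\operatorname{arctanh} y - h) - \frac{1}{J}\ln 2\cosh(\operatorname{arctanh} y - h + h) =$$

$$= \frac{y}{J}\operatorname{arctanh} y - \frac{yh}{J} - \frac{1}{J}\ln 2\cosh\operatorname{arctanh} y =$$

$$= \frac{y}{2J}\ln\frac{1+y}{1-y} - \frac{yh}{J} - \frac{1}{J}\ln\Big(\exp\Big\{\frac{1}{2}\ln\frac{1+y}{1-y}\Big\} + \exp\Big\{\frac{1}{2}\ln\frac{1-y}{1+y}\Big\}\Big) =$$

$$= \frac{y}{2J}\ln\frac{1+y}{1-y} - \frac{yh}{J} - \frac{1}{J}\ln\frac{1 + y + 1 - y}{\sqrt{1 - y^2}} = \frac{y}{2J}\ln\frac{1+y}{1-y} - \frac{yh}{J} - \frac{1}{J}\ln\frac{2}{\sqrt{1 - y^2}} =$$

$$= \frac{1}{J}\Big(\frac{1+y}{2}\ln(1+y) + \frac{1-y}{2}\ln(1-y) - yh - \ln 2\Big) =$$

$$= \frac{1}{J}\Big(\frac{1+y}{2}\ln\frac{1+y}{2} + \frac{1-y}{2}\ln\frac{1-y}{2} - yh\Big),$$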

which is the required result. 



□ 
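
The closed form of Proposition 10 can be compared with a direct numerical evaluation of the supremum defining the Legendre transform. The Python sketch below is only an illustration: the function names are hypothetical, and the supremum is located by ternary search, which is applicable because $xy - f(x)$ is concave in $x$. The search interval is an assumption that is wide enough for the values of $y$ used here.

```python
import math

def f(x, J, h):
    """f(x) = (1/J) ln 2cosh(Jx + h)."""
    return math.log(2.0 * math.cosh(J * x + h)) / J

def f_star_closed(y, J, h):
    """Closed form of the Legendre transform stated in Proposition 10."""
    entropy_term = ((1 + y) / 2 * math.log((1 + y) / 2)
                    + (1 - y) / 2 * math.log((1 - y) / 2))
    return (entropy_term - y * h) / J

def f_star_numeric(y, J, h, lo=-50.0, hi=50.0, steps=200):
    """sup_x (x*y - f(x)) by ternary search on a concave function of x."""
    for _ in range(steps):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if a * y - f(a, J, h) < b * y - f(b, J, h):
            lo = a
        else:
            hi = b
    x = 0.5 * (lo + hi)
    return x * y - f(x, J, h)

J, h = 1.2, 0.3
differences = [abs(f_star_numeric(y, J, h) - f_star_closed(y, J, h))
               for y in (-0.9, -0.3, 0.0, 0.5, 0.95)]
print(differences)
```

The printed differences are at the level of rounding errors for every tested $y$ in $(-1, 1)$.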



We can similarly verify that the Legendre transform of $g(x) = \frac{1}{2}x^2$ is given by the
function $g^*(x) = \frac{1}{2}x^2$.

This way we see that we can write the bounding functions as:

$$p_{up}(\tilde m) = p_{low}(\tilde m) = J\big(f(\tilde m) - g(\tilde m)\big),$$

$$p'_{low}(\tilde m) = J\big(g^*(\tilde m) - f^*(\tilde m)\big), \tag{3.27}$$

and the following proposition tells us that all of the bounds that we have found coincide.

Proposition 11. Let $f$ and $g$ be two convex functions and $f^*$ and $g^*$ be their Legendre
transforms. Then the following is true:

$$\sup_x\,\big(f(x) - g(x)\big) = \sup_y\,\big(g^*(y) - f^*(y)\big).$$

Proof. For a nice proof see [22], or the appendix in [40].

□ 

The last proposition tells us that both the variational principles we have derived provide 
the correct value for the thermodynamic pressure, and so the results of this section can be 
summarised in the following 



Theorem 1. Given a Hamiltonian as defined in (3.3), and defining the pressure per particle as $p_N = \frac{1}{N}\ln Z_N$, given parameters $J$ and $h$, the thermodynamic limit

$$\lim_{N\to\infty} p_N = p$$

of the pressure exists, and can be expressed in one of the following equivalent forms:

a) $p = \sup_{\tilde m} p_{up}(\tilde m) = \sup_{\tilde m} p_{low}(\tilde m)$

b) $p = \sup_{\tilde m} p'_{low}(\tilde m)$
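As an illustration of the equivalence of the two variational forms, the sketch below evaluates both suprema on a grid of $[-1, 1]$ and compares them. It assumes $J > 0$, uses the explicit expressions $p_{up}(m) = -\frac{J}{2}m^2 + \ln 2\cosh(Jm + h)$ and $p'_{low}(m) = \frac{J}{2}m^2 + hm + S(m)$ that follow from the definitions of $f$, $g$ and their Legendre transforms, and the function names and parameter values are hypothetical.

```python
import math

def p_up(m, J, h):
    """J * (f(m) - g(m)) with f and g as defined in the text."""
    return -0.5 * J * m * m + math.log(2.0 * math.cosh(J * m + h))

def entropy(m):
    """-(1+m)/2 ln((1+m)/2) - (1-m)/2 ln((1-m)/2), with 0 ln 0 taken as 0."""
    s = 0.0
    for q in ((1.0 + m) / 2.0, (1.0 - m) / 2.0):
        if q > 0.0:
            s -= q * math.log(q)
    return s

def p_low_prime(m, J, h):
    """J * (g*(m) - f*(m)), i.e. the entropic form of the bound."""
    return 0.5 * J * m * m + h * m + entropy(m)

def sup_on_grid(func, J, h, steps=20_001):
    """Approximate the supremum over [-1, 1] by a grid search (illustrative)."""
    return max(func(-1.0 + 2.0 * k / (steps - 1), J, h) for k in range(steps))

if __name__ == "__main__":
    for J, h in ((0.5, 0.0), (1.5, 0.2), (2.0, -0.1)):  # hypothetical values
        print(J, h, sup_on_grid(p_up, J, h), sup_on_grid(p_low_prime, J, h))
```

The two functions differ pointwise; only their suprema are expected to coincide.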

3.5 Consistency equation 

In the last section we have expressed the thermodynamic pressure of the Curie-Weiss model as the supremum of two distinct functions. More can be said about this variational principle, since the argument of the supremum also has an important meaning: we shall see in this section that, when the supremum of $p_{up} = p_{low}$ or of $p'_{low}$ is attained at a unique point, that point gives the thermodynamic value of the magnetization. If the supremum is attained at more than one point, we have a phase transition, and each maximizer gives a pure state for the magnetization.

First, we point out the straightforward fact that stationary points of both $p_{up} = p_{low}$ and $p'_{low}$ satisfy the condition:

$$\tilde m^* = \tanh(J\tilde m^* + h), \qquad (3.28)$$

which can be found in the literature as consistency equation, mean field equation, state equation, secularity equation, and other names, depending on the context.
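Equation (3.28) has no closed-form solution in general, but it can be solved by iterating the map $m \mapsto \tanh(Jm + h)$. The following is a minimal sketch under the assumption $J > 0$, for which the map is increasing and the iteration converges monotonically to a fixed point that depends on the starting value; the function name and default arguments are illustrative.

```python
import math

def solve_consistency(J, h, m0=0.5, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration for m = tanh(J m + h), started from m0."""
    m = m0
    for _ in range(max_iter):
        m_next = math.tanh(J * m + h)
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    return m

if __name__ == "__main__":
    # Above J = 1 and at h = 0 two symmetric non-zero solutions appear.
    print(solve_consistency(2.0, 0.0, m0=0.5))
    print(solve_consistency(2.0, 0.0, m0=-0.5))
    # Below J = 1 and at h = 0 the iteration collapses to m = 0.
    print(solve_consistency(0.5, 0.0, m0=0.5))
```

Different starting points can select different solutions; which of them is the supremum still has to be decided by comparing the values of the bounding function.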

This equation is indeed important: since the bounding functions are smooth, and since it can be easily seen by checking derivatives that none of them attains its supremum at the boundary of $[-1, 1]$, any supremum of these functions satisfies this equation. It is also interesting to notice that the trivial fact that this equation always has a solution inside $[-1, 1]$ can also be seen as a consequence of the existence results of Section 3.2.

Proposition 12. Let $J$ and $h$ be given so that $p_{up} = p_{low}$ has a unique supremum, which is attained at $\tilde m^*$. Then $\tilde m^* = \lim_{N\to\infty}\omega_N(m) = \lim_{N\to\infty}\omega_N(\sigma_i)$.

Proof. The following holds at finite $N$, by definition of the pressure $p_N(J, h)$:

$$\frac{\partial p_N}{\partial h} = \omega_N(m).$$



We have proved that $\{p_N\}$ is a convergent sequence of functions which are convex (for a proof of the convexity of the pressure see [32], where convexity is proved for the free energy in the Ising model, which is essentially the same as the pressure multiplied by $-1$). This implies that the limit function is also convex, and as such it is differentiable almost everywhere. As a consequence we have the following:

$$\lim_{N\to\infty}\omega_N(m) = \lim_{N\to\infty}\frac{\partial p_N}{\partial h} = \frac{\partial}{\partial h}\lim_{N\to\infty} p_N$$

whenever the last derivative exists (for a proof that the limit of the derivatives coincides with the derivative of the limit in this case see [22], p. 114).

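Therefore if we write $\lim_{N\to\infty} p_N = p(J, h, \tilde m^*(J, h))$, we can write the following:

$$\frac{\partial \sup_{\tilde m} p_{low}}{\partial h} = \frac{\partial p(J, h, \tilde m^*(J, h))}{\partial h} = -J\frac{\partial\tilde m^*}{\partial h}\tilde m^* + \tanh(J\tilde m^* + h) + J\frac{\partial\tilde m^*}{\partial h}\tanh(J\tilde m^* + h),$$

and by substituting (3.28) we get

$$\frac{\partial \sup_{\tilde m} p_{low}}{\partial h} = \tilde m^*,$$

which is our result.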



□ 
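The statement of the proposition can be observed at finite $N$. The sketch below computes the exact Gibbs average of the magnetization by summing over the $N + 1$ magnetization levels with their binomial degeneracy, and compares it with the solution of the consistency equation. It assumes the Hamiltonian has the form $H_N = -N(\frac{J}{2}m^2 + hm)$, and the parameter values (chosen so that the supremum is unique) are hypothetical.

```python
import math

def finite_n_magnetization(N, J, h):
    """Exact Gibbs average of m, assuming H_N = -N (J m^2 / 2 + h m)."""
    levels, log_w = [], []
    for k in range(N + 1):          # k = number of +1 spins
        m = (2 * k - N) / N
        log_binom = math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
        levels.append(m)
        log_w.append(log_binom + N * (0.5 * J * m * m + h * m))
    shift = max(log_w)              # subtract the maximum to avoid overflow
    weights = [math.exp(lw - shift) for lw in log_w]
    return sum(m * w for m, w in zip(levels, weights)) / sum(weights)

def consistency_solution(J, h, m0=0.5, iterations=2000):
    """Iterate m -> tanh(J m + h); assumes J > 0 and a unique solution."""
    m = m0
    for _ in range(iterations):
        m = math.tanh(J * m + h)
    return m

if __name__ == "__main__":
    J, h = 0.8, 0.3  # hypothetical values with a unique solution
    m_star = consistency_solution(J, h)
    for N in (10, 100, 1000):
        print(N, finite_n_magnetization(N, J, h), m_star)
```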



A similar proposition can be proved analogously for $p'_{low}$. We write $\omega(m) = \lim_{N\to\infty}\omega_N(m)$ and $\omega(\sigma_i) = \lim_{N\to\infty}\omega_N(\sigma_i)$. As a consequence of Proposition 12 we have that we can write

$$p_{low}(\tilde m^*) = S - U$$

where

$$S = -\frac{1+\omega(\sigma_i)}{2}\ln\Bigl(\frac{1+\omega(\sigma_i)}{2}\Bigr) - \frac{1-\omega(\sigma_i)}{2}\ln\Bigl(\frac{1-\omega(\sigma_i)}{2}\Bigr)$$

is the thermodynamic entropy and

$$U = -\frac{J}{2}\omega(m)^2 - h\,\omega(m)$$

is the thermodynamic internal energy, as can be derived directly from the Gibbs distribution.

3.6 A heuristic approach



We shall now describe a heuristic procedure to obtain the consistency equation (3.28). First
of all, we make the following observation about the Gibbs average uJi\i{cr]\i) of the magneti- 
zation: 

^ ae{-i,ir 

We now define the following Hamiltonian H^: 

J ^ Af 

Hpj = T > (TjCTi — hy Uj, 

2{N + l) * 
^ ' i,j=i i=i 

and its associated partition function 



^^£{-1,1} 



which allows us to write: 



u<7e{-l,l}'' 



E.e{_i,i}iv-i cosh(^ Zl'i' '^i + h + ^)e-^--i W 

cD;v(sinh(^E»=7'^» + ^ + F)) 
a}Ar(cosh(^ EilT^ o-j + /i + ^)) 

Now, if we assume that the last line implies

$$\lim_{N\to\infty}\omega_N(\sigma_i) = \lim_{N\to\infty}\frac{\omega_N(\sinh(Jm + h))}{\omega_N(\cosh(Jm + h))} \qquad (3.29)$$

we can use the factorization properties of the model in order to derive the following. 

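Let us consider $\omega_N(\sinh(Jm + h))$, and write it by making the power series at the argument explicit:

$$\omega_N(\sinh(Jm + h)) = \omega_N\Bigl(\sum_{k=0}^{\infty}\frac{(Jm + h)^{2k+1}}{(2k+1)!}\Bigr).$$

Now, if we consider only a partial sum up to $n$ at the argument of the Gibbs state, and take the thermodynamic limit, the self-averaging property of the magnetization tells us that the following holds a.e. in $J$ and $h$:

$$\lim_{N\to\infty}\omega_N\Bigl(\sum_{k=0}^{n}\frac{(Jm + h)^{2k+1}}{(2k+1)!}\Bigr) = \lim_{N\to\infty}\sum_{k=0}^{n}\frac{1}{(2k+1)!}\sum_{l=0}^{2k+1}\binom{2k+1}{l}J^l h^{2k+1-l}\,\omega_N(m^l) = \lim_{N\to\infty}\sum_{k=0}^{n}\frac{(J\omega_N(m) + h)^{2k+1}}{(2k+1)!}. \qquad (3.30)$$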

Now, disregarding convergence problems, the limit of (3.30) together with the assumption (3.29) gives the following equation:

$$\tilde m^* = \tanh(J\tilde m^* + h),$$

where $\tilde m^*$ is the thermodynamic magnetization. This way we have derived heuristically the consistency equation describing the most important quantity for our model just by making use of the model's factorization properties.

It is important, however, to stress that the procedure we proposed in this section is not mathematically rigorous: assumption (3.29), though sensible, has not been derived rigorously, and the possible convergence problems have not been considered. Nevertheless, since the procedure has provided the right answer, which we have derived rigorously throughout the chapter, and since it consists of simple considerations, it can be seen as a way of approaching models defined on random networks rather than on the complete graph, which are not as well understood as the one treated in this chapter.

Chapter 4

The Curie-Weiss model for many populations

In this chapter we consider the problem of characterizing the equilibrium statistical mechanics of a mean field interacting system partitioned into $p$ sets of spins. The relevance of such a problem to social modelling is that such a partition can be made to correspond to the partition into classes of people sharing the same socio-economic attributes, as described in Chapter 2.

Our results can be summarised as follows. After introducing the model we show in Section 4.2 that it is well posed by showing that its thermodynamic limit exists. The result is non-trivial, since sub-additivity is not met at finite volume. In Section 4.3 we show that the system fulfills a factorization property for the correlation functions which reduces the equilibrium state to only $p$ degrees of freedom. The method is conceptually similar to the one developed by Guerra in [35] to derive identities for the overlap distributions in the Sherrington and Kirkpatrick model.

We also derive the pressure of the model by rigorous methods developed in the recent study of mean field spin glasses (see [37] for a review). It is interesting to notice that, though very simple, our model encompasses a range of regimes that do not admit solution by the elegant interpolation method used in the celebrated existence result of the Sherrington and Kirkpatrick model [36]. This is due to the lack of positivity of the quadratic form describing the considered interaction. Nevertheless we are able to solve the model exactly in Section 4.4, using the lower bound provided by the Gibbs variational principle, and thanks to a further bound given by a partitioning of the configuration space, itself originally devised in
the study of spin glasses (see [37], [19], [38]).

As in the classical Curie-Weiss model, the exact solution is provided in an implicit form; for our system, however, we find a system of equations of state, which are coupled as well as transcendental, and this makes the full characterization of all the possible regimes highly non-trivial. A simple analytic result about the number of solutions for the two-population case is proved in Section 4.5.

4.1 The Model 

We can generalize the Curie-Weiss model to $p$ populations, allowing $r$-body interactions with $r = 1, \dots, p$. This gives rise to the following Hamiltonian:

$$H_N = -N\sum_{r=1}^{p}\ \sum_{i_1,\dots,i_r=1}^{p} J_{i_1,\dots,i_r}\prod_{k=1}^{r} m_{i_k}, \qquad (4.1)$$

or, equivalently, to the following utility function for individual $i$:

$$U_i = \sum_{r=1}^{p}\ \sum_{i_1,\dots,i_{r-1}=1}^{p} J_{i_1,\dots,i_{r-1},i}\prod_{k=1}^{r-1} m_{i_k}.$$

Here $J_{i_1,\dots,i_r}$ gives the interaction coefficients corresponding to the $r$-body interaction among individuals coming from populations $i_1, \dots, i_r$, respectively. We can also consider the external fields to be already included in this form of the model, just by setting $J_i = h_i$. So we have defined interactions by using a tensor $J_{i_1,\dots,i_r}$ of rank $r$ for each of the $r$-body interactions.
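To make the structure of (4.1) concrete, the sketch below evaluates the Hamiltonian from a list of population magnetizations and a sparse collection of coupling tensors. The representation of the couplings as a dictionary keyed by index tuples, with 0-based population indices, is an assumption made for illustration only; rank-1 entries play the role of the external fields.

```python
from math import prod

def hamiltonian(N, m, couplings):
    """Evaluate H_N = -N * sum_r sum_{i_1..i_r} J_{i_1..i_r} m_{i_1} ... m_{i_r}.

    N         -- total number of spins
    m         -- list of population magnetizations m_1, ..., m_p
    couplings -- dict mapping index tuples (i_1, ..., i_r), 0-based, to J values
                 (hypothetical sparse representation; missing tuples count as 0)
    """
    total = 0.0
    for indices, J in couplings.items():
        total += J * prod(m[i] for i in indices)
    return -N * total

if __name__ == "__main__":
    # Two populations, external fields (rank 1) and pair interactions (rank 2).
    couplings = {(0,): 0.3, (1,): 0.1, (0, 0): 1.0, (0, 1): 0.5}
    print(hamiltonian(10, [0.5, -0.2], couplings))
```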

4.2 Existence of the thermodynamic limit for many populations

We shall prove that our model admits a thermodynamic limit by exploiting an existence 
theorem provided for mean field models in [8]: the result states that the existence of the 
pressure per particle for large volumes is guaranteed by a monotonicity condition on the 
equilibrium state of the Hamiltonian. Such a result proves to be quite useful when the 
condition of convexity introduced by the interpolation method [36], [37] doesn't apply due
to lack of positivity of the quadratic form representing the interactions. We therefore prove 
the existence of the thermodynamic limit independently of an exact solution. Such a line of 
enquiry is pursued in view of further refinements of our model, that shall possibly involve 
random interactions of spin glass or random graph type, and that might or might not come 
with an exact expression for the pressure.

Proposition 13. There exists a function $p$ of all the parameters $J_{i_1,\dots,i_r}$ such that

$$\lim_{N\to\infty} p_N = p.$$

The previous proposition is proved with a series of lemmas. Theorem 1 in [5] states that, given a Hamiltonian $H_N$ and its associated equilibrium state $\omega_N$, the model admits a thermodynamic limit whenever the physical condition

$$\omega_N(H_N) \geq \omega_N(H_{N_1}) + \omega_N(H_{N_2}), \qquad N_1 + N_2 = N, \qquad (4.2)$$

is verified.

We proceed by first verifying this condition for an alternative Hamiltonian $\tilde H_N$, and then showing that its pressure $\tilde p_N$ tends to our original pressure $p_N$ as $N$ increases. We choose $\tilde H_N$ in such a way that the condition (4.2) is verified as an equality.

Now, define the alternative Hamiltonian $\tilde H_N$ as follows:

Hn = -CN[[ — 2^ ^jl-(^jr 

1=1 ife=JV,^_i + l,...,]V,j^ 

JfeT^ih for '=7^'! 

where C is a real number. 

Though the notation is cumbersome at this point, the new Hamiltonian simply considers products of $r$ distinct spins, $k_i$ of which are taken from population $i$ (i.e. $\sum_{i=1}^{p} k_i = r$), so the combinatorial coefficient is just dividing the sum by the correct number of terms contained in the sum itself.

Lemma 1. There exists a function $\tilde p$ such that

$$\lim_{N\to\infty}\tilde p_N = \tilde p.$$

Proof. By linearity we have that 

u;n{Hn) = -CN n ^^y,^'^' E c^7v(cTji...a,-J = -CiVo.^(a,-,...ajJ, 

1=1 jfe=^ife-i+i.--.JVi^ 

(4.3) 

where, with a little abuse of notation, we let $\sigma_{j_1}, \dots, \sigma_{j_r}$ after the last equality be distinct spins taken from their own respective populations. The last equality hence follows from the invariance of $\omega_N$ with respect to permutations of spins belonging to the same population.

Equation (4.3) implies trivially

$$\omega_N(\tilde H_N - \tilde H_{N_1} - \tilde H_{N_2}) = 0$$

for $N_1 + N_2 = N$, which verifies (4.2) as an equality.

□ 

The following two lemmas show that the difference between $H_N$ and $\tilde H_N$ is thermodynamically negligible and as a consequence their pressures coincide in the thermodynamic limit.

Though the notation is quite tedious, the proof is in no way different from the one described in [8]. We chose to keep full generality during this existence proof in order to show that the mean-field setting allows one to consider a whole range of possibilities for interaction, which might turn out useful for the modelling effort.

Lemma 2.

$$H_N = \tilde H_N + O(1), \qquad (4.4)$$

i.e.

$$\lim_{N\to\infty}\frac{H_N}{N} = \lim_{N\to\infty}\frac{\tilde H_N}{N}.$$

Proof. We begin the proof by rephrasing the Hamiltonian in terms of the spins, as follows:

r=l ii,...,ir=l k=l 

= -E{ E ^^iiw'' ..ri"'.'"^^ 

r=l ii,...,ir=l k=l k=l 

= -E E |]^n— •^^i--^'- E ^ii-^>| = 

r=l h,...,ir=l k=l ifc=Afij^_i+l,...,Ar,j^ 

where 

P AT. 

1=1 

We only need to give details of the proof in the case where only one of the coefficients $J_{i_1,\dots,i_r} \neq 0$. The general case follows by summing up all the terms corresponding to non-zero interaction coefficients and noticing that, since this sum has only finitely many terms, the result still holds.

So we consider the following Hamiltonian



r 1 1 



N'r-l 1± Q, 

k=l k=l jk=N,^^i+l,...,N,^ 



and we can lighten our notation by setting C = ^Jii,...,^, 

jk=Ni^_^+l,...,Nij^ 

Now, following [8] we divide the sum in two parts, as follows: 
C C * 

The first part is a sum only over products of distinct spins, whereas $\sum^*$ is a sum of all products where at least two spins are equal. It is straightforward to show that

C * 



so that we can rewrite $H_N$ as follows:

iiN = ^ Y a,,...aj^+0{l). 

ife=JVij,_l + l,...,Af,j, 

A straightforward calculation comparing $H_N$ and $\tilde H_N$ can now check that

$$H_N = \tilde H_N + O(1),$$

which is our result.



□ 



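Lemma 3. Say $p_N = \frac{1}{N}\ln Z_N$, and say $h_N(\sigma) = \frac{H_N(\sigma)}{N}$. Define $\tilde Z_N$ and $\tilde h_N$ in an analogous way. Define

$$k_N = \|h_N - \tilde h_N\| = \sup_{\sigma\in\{-1,+1\}^N}\bigl\{|h_N(\sigma) - \tilde h_N(\sigma)|\bigr\} < \infty. \qquad (4.5)$$

Then

$$|p_N - \tilde p_N| \leq \|h_N - \tilde h_N\|.$$

Proof.

$$
\begin{aligned}
p_N - \tilde p_N &= \frac{1}{N}\ln Z_N - \frac{1}{N}\ln\tilde Z_N = \frac{1}{N}\ln\frac{Z_N}{\tilde Z_N} = \frac{1}{N}\ln\frac{\sum_\sigma e^{-N h_N(\sigma)}}{\sum_\sigma e^{-N\tilde h_N(\sigma)}} \\
&\leq \frac{1}{N}\ln\frac{\sum_\sigma e^{-N h_N(\sigma)}}{\sum_\sigma e^{-N(h_N(\sigma) + k_N)}} = \frac{1}{N}\ln e^{N k_N} = k_N = \|h_N - \tilde h_N\|,
\end{aligned}
$$

where the inequality follows from the definition of $k_N$ in (4.5) and from monotonicity of the exponential and logarithmic functions. The inequality for $\tilde p_N - p_N$ is obtained in a similar fashion. □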
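Lemmas 2 and 3 can be illustrated in the simplest setting. The sketch below assumes a single population with a pure two-body interaction, so that $H_N = -CNm^2$ sums over all pairs of spins while the alternative Hamiltonian sums over distinct pairs only and is normalised by their number $N(N-1)$. This special case, the function names and the value of $C$ are illustrative assumptions, not the general construction of the text. Both pressures are computed exactly by summing over magnetization levels.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    shift = max(values)
    return shift + math.log(sum(math.exp(v - shift) for v in values))

def compare_pressures(N, C):
    """Return (p_N, alternative p_N, sup-norm gap k_N) for the assumed model."""
    log_terms, log_terms_alt, gap = [], [], 0.0
    for k in range(N + 1):                      # k = number of +1 spins
        s = 2 * k - N                           # total spin of the configuration
        log_binom = math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
        H = -C * s * s / N                      # H_N = -C N m^2, all pairs (i, j)
        H_alt = -C * (s * s - N) / (N - 1)      # distinct pairs, normalised by N(N-1)
        log_terms.append(log_binom - H)
        log_terms_alt.append(log_binom - H_alt)
        gap = max(gap, abs(H - H_alt) / N)      # sup over configurations of |h - h~|
    p = log_sum_exp(log_terms) / N
    p_alt = log_sum_exp(log_terms_alt) / N
    return p, p_alt, gap

if __name__ == "__main__":
    for N in (10, 100, 1000):
        p, p_alt, gap = compare_pressures(N, 0.7)  # C = 0.7 is a hypothetical value
        print(N, abs(p - p_alt), gap)
```

In this special case the two Hamiltonians differ by a bounded quantity, so the gap decays like $1/N$ and the difference of the pressures stays below it.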

We are now ready to prove the main result for this section:

Proof of Proposition 13. The existence of the thermodynamic limit follows from our lemmas. Indeed, since by Lemma 1 the limit for $\tilde p_N$ exists, Lemma 3 and Lemma 2 tell us that

$$\lim_{N\to\infty}|p_N - \tilde p_N| \leq \lim_{N\to\infty}\|h_N - \tilde h_N\| = 0,$$

implying our result.



□ 



4.3 Factorization properties 

From now on we shall restrict the model to include pair interactions only. Therefore, we have a Hamiltonian of the following kind:

$$H_N = -N\sum_{i,j=1}^{p}\frac{J_{ij}}{2}m_i m_j - N\sum_{i=1}^{p}h_i m_i. \qquad (4.6)$$

In this section we shall prove that the correlation functions of our model factorize completely in the thermodynamic limit, for almost every choice of parameters. This implies that all the thermodynamic properties of the system can be described by the magnetizations $m_i$ of the $p$ populations defined in Section 4.1. Indeed, the exact solution of the model, to be derived in the next section, comes as $p$ coupled equations of state for the $m_i$.

Proposition 14.

$$\lim_{N\to\infty}\bigl(\omega_N(\sigma_i\sigma_j) - \omega_N(\sigma_i)\omega_N(\sigma_j)\bigr) = 0$$

for almost every choice of parameters, where $\sigma_i$, $\sigma_j$ are any two distinct spins in the system.

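Proof. We recall the definition of the Hamiltonian

$$H_N = -N\sum_{i,j=1}^{p}\frac{J_{ij}}{2}m_i m_j - N\sum_{i=1}^{p}h_i m_i,$$

and of the pressure per particle

$$p_N = \frac{1}{N}\ln Z_N = \frac{1}{N}\ln\sum_\sigma e^{-H_N(\sigma)}.$$

By taking first and second partial derivatives of $p_N$ with respect to $h_i$ we get

$$\frac{\partial p_N}{\partial h_i} = \frac{1}{N}\sum_\sigma N m_i(\sigma)\frac{e^{-H_N(\sigma)}}{Z_N} = \omega_N(m_i),$$

$$\frac{\partial^2 p_N}{\partial h_i^2} = N\bigl(\omega_N(m_i^2) - \omega_N(m_i)^2\bigr).$$

By using these relations we can bound above the integral with respect to $h_i$ of the fluctuations of $m_i$ in the Gibbs state:

$$
\begin{aligned}
\Bigl|\int_{h_i^{(1)}}^{h_i^{(2)}}\bigl(\omega_N(m_i^2) - \omega_N(m_i)^2\bigr)\,dh_i\Bigr| &= \frac{1}{N}\Bigl|\int_{h_i^{(1)}}^{h_i^{(2)}}\frac{\partial^2 p_N}{\partial h_i^2}\,dh_i\Bigr| = \frac{1}{N}\Bigl|\frac{\partial p_N}{\partial h_i}\Big|_{h_i^{(1)}}^{h_i^{(2)}}\Bigr| \\
&\leq \frac{1}{N}\Bigl(\bigl|\omega_N(m_i)|_{h_i^{(2)}}\bigr| + \bigl|\omega_N(m_i)|_{h_i^{(1)}}\bigr|\Bigr) = O\Bigl(\frac{1}{N}\Bigr). \qquad (4.7)
\end{aligned}
$$

On the other hand we have that

$$\omega_N(m_i) = \frac{\partial p_N}{\partial h_i},$$

and

$$\omega_N(m_i^2) = 2\frac{\partial p_N}{\partial J_{i,i}},$$

so, by convexity of the thermodynamic pressure $p = \lim_{N\to\infty} p_N$, both quantities $\frac{\partial p}{\partial h_i}$ and $\frac{\partial p}{\partial J_{i,i}}$ have well defined thermodynamic limits almost everywhere. This together with (4.7) implies that

$$\lim_{N\to\infty}\bigl(\omega_N(m_i^2) - \omega_N(m_i)^2\bigr) = 0 \quad \text{a.e. in } h_i, J_{i,i}. \qquad (4.8)$$

In order to prove our statement we shall write the magnetization $m_i$ in terms of spins belonging to the $i$-th population, and then use the permutation invariance of the Gibbs measure. Writing $\sigma_j$, $\sigma_k$ for any two distinct spins of population $i$, we have

$$
\begin{aligned}
\omega_N(m_i) &= \omega_N\Bigl(\frac{1}{N_i - N_{i-1}}\sum_{l=N_{i-1}+1}^{N_i}\sigma_l\Bigr) = \omega_N(\sigma_j), \\
\omega_N(m_i^2) &= \omega_N\Bigl(\frac{1}{(N_i - N_{i-1})^2}\sum_{l,l'=N_{i-1}+1}^{N_i}\sigma_l\sigma_{l'}\Bigr) = \frac{N_i - N_{i-1} - 1}{N_i - N_{i-1}}\,\omega_N(\sigma_j\sigma_k) + \frac{1}{N_i - N_{i-1}}. \qquad (4.9)
\end{aligned}
$$

We have that (4.8) and (4.9) imply

$$\lim_{N\to\infty}\bigl(\omega_N(\sigma_i\sigma_j) - \omega_N(\sigma_i)\omega_N(\sigma_j)\bigr) = 0, \qquad (4.10)$$

which verifies our statement for all couples of spins $i \neq j$ belonging to the same population.

Furthermore, by defining $\mathrm{Var}_N(m_i) = \omega_N(m_i^2) - \omega_N(m_i)^2$ for all populations $i$, we exploit (4.8), and use the Cauchy-Schwarz inequality to get

$$|\omega_N(m_i m_j) - \omega_N(m_i)\omega_N(m_j)| \leq \sqrt{\mathrm{Var}_N(m_i)\mathrm{Var}_N(m_j)} \to 0 \quad \text{a.e. in } J_{i,i}, J_{j,j}, h_i, h_j. \qquad (4.11)$$

By using (4.9) and (4.11) we can therefore verify statements which are analogous to (4.10), but which concern $\omega_N(\sigma_i\sigma_j)$ where $\sigma_i$ and $\sigma_j$ are spins belonging to different subsets. We have thus proved our claim for any couple of spins in the global system.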

□ 
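The factorization property can be observed at finite size for two populations. The sketch below computes the Gibbs averages exactly by summing over pairs of magnetization levels with binomial degeneracies, and returns the connected correlation of two distinct spins of the first population together with the cross-population connected correlation. It assumes the Hamiltonian (4.6) with $p = 2$ and population sizes `N1`, `N2`; the coupling matrix and fields used in the example are hypothetical values in a regime with a unique equilibrium state.

```python
import math

def log_binom(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def connected_correlations(N1, N2, J, h):
    """Return (same-population, cross-population) connected spin correlations."""
    N = N1 + N2
    states = []
    for k1 in range(N1 + 1):                 # k1 = number of +1 spins in population 1
        m1 = (2 * k1 - N1) / N1
        for k2 in range(N2 + 1):
            m2 = (2 * k2 - N2) / N2
            quad = (J[0][0] * m1 * m1 + (J[0][1] + J[1][0]) * m1 * m2
                    + J[1][1] * m2 * m2)
            log_w = (log_binom(N1, k1) + log_binom(N2, k2)
                     + N * (0.5 * quad + h[0] * m1 + h[1] * m2))
            states.append((log_w, m1, m2))
    shift = max(s[0] for s in states)        # avoid overflow in exp
    z = avg1 = avg2 = avg11 = avg12 = 0.0
    for log_w, m1, m2 in states:
        w = math.exp(log_w - shift)
        z += w
        avg1 += w * m1
        avg2 += w * m2
        avg11 += w * m1 * m1
        avg12 += w * m1 * m2
    avg1, avg2, avg11, avg12 = avg1 / z, avg2 / z, avg11 / z, avg12 / z
    # omega(m1^2) = 1/N1 + (1 - 1/N1) * omega(s_i s_j) for distinct spins i, j
    pair_same = (N1 * avg11 - 1.0) / (N1 - 1)
    return pair_same - avg1 * avg1, avg12 - avg1 * avg2

if __name__ == "__main__":
    J = [[0.3, 0.1], [0.1, 0.2]]             # hypothetical couplings
    h = [0.1, 0.05]                          # hypothetical fields
    for n in (20, 50, 200):
        print(n, connected_correlations(n, n, J, h))
```

Both quantities are expected to shrink as the populations grow, in agreement with the proposition; the proposition itself only guarantees this for almost every choice of parameters.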



4.4 Solution of the model 

We shall derive upper and lower bounds for the thermodynamic limit of the pressure. The lower bound is obtained through the standard entropic variational principle, while the upper bound is derived by a decoupling strategy.

4.4.1 Upper bound

In order to find an upper bound for the pressure we shall divide the configuration space into a partition of microstates of equal magnetization, following [19], [37], [38]. Since each population $g$ consists of $N_g$ spins, its magnetization can take exactly $N_g + 1$ values, which are the elements of the set

$$R_{N_g} = \Bigl\{-1, -1 + \frac{2}{N_g}, \dots, 1 - \frac{2}{N_g}, 1\Bigr\}.$$

Clearly for every $m_g(\sigma)$ we have that

$$\sum_{\tilde m_g\in R_{N_g}}\delta_{m_g,\tilde m_g} = 1,$$

where $\delta_{x,y}$ is a Kronecker delta. This allows us to rewrite the partition function as follows:

$$
\begin{aligned}
Z_N &= \sum_\sigma\exp\Bigl\{\frac{N}{2}\sum_{i,j=1}^{p}J_{ij}m_i m_j + N\sum_{i=1}^{p}h_i m_i\Bigr\} \\
&= \sum_\sigma\sum_{\tilde m_g\in R_{N_g}}\prod_{g=1}^{p}\delta_{m_g,\tilde m_g}\exp\Bigl\{\frac{N}{2}\sum_{i,j=1}^{p}J_{ij}m_i m_j + N\sum_{i=1}^{p}h_i m_i\Bigr\}. \qquad (4.12)
\end{aligned}
$$

Thanks to the Kronecker delta symbols, we can substitute $m_i$ (the average of the spins within a configuration) with the parameter $\tilde m_i$ (which is not coupled to the spin configurations) in any convenient fashion.

Therefore we can use the following relations in order to linearize all quadratic terms appearing in the Hamiltonian:

$$(m_i - \tilde m_i)^2 = 0 \quad \forall i,$$

$$(m_i - \tilde m_i)(m_j - \tilde m_j) = 0 \quad \forall i \neq j.$$
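Once we've carried out these substitutions into (4.12) we are left with a function which depends only linearly on the $m_i$:

$$
\begin{aligned}
Z_N &= \sum_\sigma\sum_{\tilde m_g\in R_{N_g}}\prod_{g=1}^{p}\delta_{m_g,\tilde m_g}\exp\Bigl\{\frac{N}{2}\sum_{i,j=1}^{p}J_{ij}(m_i\tilde m_j + \tilde m_i m_j - \tilde m_i\tilde m_j) + N\sum_{i=1}^{p}h_i m_i\Bigr\} \\
&= \sum_\sigma\sum_{\tilde m_g\in R_{N_g}}\prod_{g=1}^{p}\delta_{m_g,\tilde m_g}\exp\Bigl\{-\frac{N}{2}\sum_{i,j=1}^{p}J_{ij}\tilde m_i\tilde m_j + \frac{N}{2}\sum_{i,j=1}^{p}J_{ij}(m_i\tilde m_j + \tilde m_i m_j) + N\sum_{i=1}^{p}h_i m_i\Bigr\},
\end{aligned}
$$

and bounding above the Kronecker deltas by 1 we get

$$Z_N \leq \sum_\sigma\sum_{\tilde m_g\in R_{N_g}}\exp\Bigl\{-\frac{N}{2}\sum_{i,j=1}^{p}J_{ij}\tilde m_i\tilde m_j + \frac{N}{2}\sum_{i,j=1}^{p}J_{ij}(m_i\tilde m_j + \tilde m_i m_j) + N\sum_{i=1}^{p}h_i m_i\Bigr\}. \qquad (4.13)$$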

As observed many times by Guerra [35], since both sums are taken over finitely many
terms, it is possible to exchange the order of the two summation symbols, in order to carry out the sum over the spin configurations, which now factorizes, thanks to the linearity of the interaction with respect to the $m_g$. This way we get:

$$Z_N \leq \sum_{\tilde m_g\in R_{N_g}} G(\tilde m_1, \dots, \tilde m_p),$$

where

$$G = \exp\Bigl\{-\frac{N}{2}\sum_{i,j=1}^{p}J_{ij}\tilde m_i\tilde m_j\Bigr\}\prod_{j=1}^{p}2^{N_j}\Bigl(\cosh\Bigl(\sum_{i=1}^{p}\frac{J_{ij} + J_{ji}}{2\alpha_j}\tilde m_i + \frac{h_j}{\alpha_j}\Bigr)\Bigr)^{N_j}, \qquad (4.14)$$

where $\alpha_j = \frac{N_j}{N}$.

Since the summation is taken over the ranges $R_{N_g}$, of cardinality $N_g + 1$, we get that the total number of terms is $\prod_{g=1}^{p}(N_g + 1)$. Therefore

$$Z_N \leq \prod_{g=1}^{p}(N_g + 1)\sup_{\tilde m_1,\dots,\tilde m_p} G, \qquad (4.15)$$

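which leads to the following upper bound for $p_N$:

$$p_N = \frac{1}{N}\ln Z_N \leq \frac{1}{N}\sum_{g=1}^{p}\ln(N_g + 1) + \frac{1}{N}\ln\sup_{\tilde m_1,\dots,\tilde m_p} G. \qquad (4.16)$$

Now defining the $N$-independent function

$$p_{up} = \frac{1}{N}\ln G = \ln 2 - \frac{1}{2}\sum_{i,j=1}^{p}J_{ij}\tilde m_i\tilde m_j + \sum_{j=1}^{p}\alpha_j\ln\cosh\Bigl(\sum_{i=1}^{p}\frac{J_{ij} + J_{ji}}{2\alpha_j}\tilde m_i + \frac{h_j}{\alpha_j}\Bigr), \qquad (4.17)$$

where $\alpha_j = \frac{N_j}{N}$, the thermodynamic limit gives:

$$\limsup_{N\to\infty} p_N \leq \sup_{\tilde m_1,\dots,\tilde m_p} p_{up}. \qquad (4.18)$$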

We can summarize the previous computation into the following:

Lemma 4. Given a Hamiltonian as defined in (4.6), and defining the pressure per particle as $p_N = \frac{1}{N}\ln Z_N$, given parameters $J_{ij}$ and $h_i$, the following inequality holds:

$$\limsup_{N\to\infty} p_N \leq \sup_{\tilde m_1,\dots,\tilde m_p} p_{up}$$

where

$$p_{up} = \ln 2 - \frac{1}{2}\sum_{i,j=1}^{p}J_{ij}\tilde m_i\tilde m_j + \sum_{j=1}^{p}\alpha_j\ln\cosh\Bigl(\sum_{i=1}^{p}\frac{J_{ij} + J_{ji}}{2\alpha_j}\tilde m_i + \frac{h_j}{\alpha_j}\Bigr), \qquad (4.19)$$

and $\tilde m_g \in [-1, 1]$.
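The finite-volume bound (4.16) can be tested directly for two populations. The sketch below computes the exact pressure by summing over pairs of magnetization levels and compares it with the right-hand side of (4.16), where the supremum of $p_{up}$ is taken over the lattice $R_{N_1} \times R_{N_2}$ as in the derivation. The Hamiltonian (4.6) with $p = 2$ is assumed, and the (deliberately non-symmetric) coupling matrix and fields are hypothetical values.

```python
import math

def log_binom(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def p_up(m_bar, J, h, alpha):
    """Evaluate the function (4.19) at the trial magnetizations m_bar."""
    p = len(m_bar)
    quad = sum(J[i][j] * m_bar[i] * m_bar[j] for i in range(p) for j in range(p))
    total = math.log(2.0) - 0.5 * quad
    for j in range(p):
        field = sum((J[i][j] + J[j][i]) * m_bar[i] for i in range(p)) / (2.0 * alpha[j])
        total += alpha[j] * math.log(math.cosh(field + h[j] / alpha[j]))
    return total

def lattice(n):
    """Possible magnetization values of a population of n spins."""
    return [(2 * k - n) / n for k in range(n + 1)]

def exact_pressure(N1, N2, J, h):
    """Exact p_N for two populations, assuming the Hamiltonian (4.6)."""
    N = N1 + N2
    log_terms = []
    for k1, m1 in enumerate(lattice(N1)):
        for k2, m2 in enumerate(lattice(N2)):
            m = (m1, m2)
            quad = sum(J[i][j] * m[i] * m[j] for i in range(2) for j in range(2))
            log_terms.append(log_binom(N1, k1) + log_binom(N2, k2)
                             + N * (0.5 * quad + h[0] * m1 + h[1] * m2))
    shift = max(log_terms)
    return (shift + math.log(sum(math.exp(t - shift) for t in log_terms))) / N

def upper_bound(N1, N2, J, h):
    """Right-hand side of (4.16), with the supremum taken over the lattice."""
    N = N1 + N2
    alpha = (N1 / N, N2 / N)
    best = max(p_up((m1, m2), J, h, alpha) for m1 in lattice(N1) for m2 in lattice(N2))
    return (math.log(N1 + 1) + math.log(N2 + 1)) / N + best

if __name__ == "__main__":
    J = [[0.3, 0.4], [-0.2, 0.2]]   # hypothetical couplings
    h = [0.1, -0.05]                # hypothetical fields
    for N1, N2 in ((5, 5), (20, 30), (60, 40)):
        print(N1, N2, exact_pressure(N1, N2, J, h), upper_bound(N1, N2, J, h))
```

The logarithmic correction in (4.16) vanishes as $N$ grows, which is how (4.18) follows.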
4.4.2 Lower bound

The lower bound is provided by exploiting the well-known Gibbs entropic variational principle (see [58], p. 188). In our case, instead of considering the whole space of ansatz probability distributions considered in [58], we shall restrict to a much smaller one, and use the upper bound derived in the last section in order to show that the lower bound corresponding to the restricted space is sharp in the thermodynamic limit.

The mean-field nature of our Hamiltonian allows us to restrict the variational problem to a family of product measures with $p$ degrees of freedom, represented through the non-interacting Hamiltonian:

$$\tilde H = -r_1\sum_{i=1}^{N_1}\sigma_i - r_2\sum_{i=N_1+1}^{N_1+N_2}\sigma_i - \dots - r_p\sum_{i=N-N_p+1}^{N}\sigma_i,$$

and so, given a Hamiltonian $\tilde H$, we define the ansatz Gibbs state corresponding to it, applied to a function $f(\sigma)$, as:

$$\tilde\omega(f) = \frac{\sum_\sigma f(\sigma)e^{-\tilde H(\sigma)}}{\sum_\sigma e^{-\tilde H(\sigma)}}.$$

In order to facilitate our task, we shall express the variational principle of [58] in the following simple form:

Proposition 15. Let a Hamiltonian $H$, and its associated partition function $Z = \sum_\sigma e^{-H}$ be given. Consider an arbitrary trial Hamiltonian $\tilde H$ and its associated partition function $\tilde Z$. The following inequality holds:

$$\ln Z \geq \ln\tilde Z - \tilde\omega(H) + \tilde\omega(\tilde H). \qquad (4.20)$$

Given a Hamiltonian as defined in (4.6) and its associated pressure per particle $p_N = \frac{1}{N}\ln Z_N$, the following inequality follows from (4.20):

$$\liminf_{N\to\infty} p_N \geq \sup_{\tilde m_1,\dots,\tilde m_p} p_{low} \qquad (4.21)$$

where

$$p_{low} = \frac{1}{2}\sum_{g,k=1}^{p}J_{g,k}\tilde m_g\tilde m_k + \sum_{g=1}^{p}h_g\tilde m_g + \sum_{g=1}^{p}\alpha_g S(\tilde m_g), \qquad (4.22)$$

the function $S(\tilde m_g)$ being the entropy

$$S(\tilde m_g) = -\frac{1+\tilde m_g}{2}\ln\Bigl(\frac{1+\tilde m_g}{2}\Bigr) - \frac{1-\tilde m_g}{2}\ln\Bigl(\frac{1-\tilde m_g}{2}\Bigr)$$

and $\tilde m_g \in [-1, 1]$.



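Proof. The inequality (4.20) follows straightforwardly from Jensen's inequality:

$$e^{\tilde\omega(-H + \tilde H)} \leq \tilde\omega\bigl(e^{-H + \tilde H}\bigr). \qquad (4.23)$$

The Hamiltonian (4.6) can be written in terms of spins as:

$$H_N = -\frac{1}{2N}\sum_{g,k=1}^{p}\Bigl\{\frac{J_{g,k}}{\alpha_g\alpha_k}\sum_{i\in P_g,\,j\in P_k}\sigma_i\sigma_j\Bigr\} - \sum_{g=1}^{p}\Bigl\{\frac{h_g}{\alpha_g}\sum_{i\in P_g}\sigma_i\Bigr\}, \qquad (4.24)$$

where $P_g$ contains the labels for spins belonging to the $g$-th subpopulation, that is

$$P_g = \Bigl\{\sum_{k=1}^{g-1}N_k + 1,\ \sum_{k=1}^{g-1}N_k + 2,\ \dots,\ \sum_{k=1}^{g}N_k\Bigr\};$$

indeed its expectation on the trial state is

$$\tilde\omega(H) = -\frac{1}{2N}\sum_{g,k=1}^{p}\Bigl\{\frac{J_{g,k}}{\alpha_g\alpha_k}\sum_{i\in P_g,\,j\in P_k}\tilde\omega(\sigma_i\sigma_j)\Bigr\} - \sum_{g=1}^{p}\Bigl\{\frac{h_g}{\alpha_g}\sum_{i\in P_g}\tilde\omega(\sigma_i)\Bigr\} \qquad (4.25)$$

and a standard computation for the moments leads to

$$\tilde\omega(H) = -\frac{N}{2}\sum_{g=1}^{p}\Bigl(1 - \frac{1}{N_g}\Bigr)J_{g,g}(\tanh r_g)^2 - \frac{1}{2}\sum_{g=1}^{p}\frac{J_{g,g}}{\alpha_g} - \frac{N}{2}\sum_{g\neq k=1}^{p}J_{g,k}\tanh r_g\tanh r_k - N\sum_{g=1}^{p}h_g\tanh r_g. \qquad (4.26)$$

Analogously, the Gibbs state of $\tilde H$ is:

$$\tilde\omega(\tilde H) = -N\sum_{g=1}^{p}\alpha_g r_g\tanh r_g,$$

and the non-interacting partition function is:

$$\tilde Z_N = \sum_\sigma e^{-\tilde H(\sigma)} = \prod_{g=1}^{p}2^{N_g}(\cosh r_g)^{N_g},$$

which implies that the non-interacting pressure gives

$$\tilde p_N = \frac{1}{N}\ln\tilde Z_N = \ln 2 + \sum_{g=1}^{p}\alpha_g\ln\cosh r_g.$$

So we can finally apply Proposition 15 in order to find a lower bound for the pressure $p_N = \frac{1}{N}\ln Z_N$:

$$p_N = \frac{1}{N}\ln Z_N \geq \frac{1}{N}\bigl(\ln\tilde Z_N - \tilde\omega(H) + \tilde\omega(\tilde H)\bigr) \qquad (4.27)$$

which explicitly reads:

$$
\begin{aligned}
p_N = \frac{1}{N}\ln Z_N \geq{} & \ln 2 + \sum_{g=1}^{p}\alpha_g\ln\cosh r_g && (4.28) \\
& + \frac{1}{2}\sum_{g,k=1}^{p}J_{g,k}\tanh r_g\tanh r_k + \sum_{g=1}^{p}h_g\tanh r_g && (4.29) \\
& - \sum_{g=1}^{p}\alpha_g r_g\tanh r_g && (4.30) \\
& + \frac{1}{2N}\sum_{g=1}^{p}\frac{J_{g,g}}{\alpha_g} - \frac{1}{2N}\sum_{g=1}^{p}\frac{J_{g,g}}{\alpha_g}(\tanh r_g)^2. && (4.31)
\end{aligned}
$$

Taking the lim inf over $N$ on the left hand side and the supremum over the variables $r_g$ on the right hand side, we get (4.21) after performing the change of variables $\tilde m_g = \tanh r_g$.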

□ 

4.4.3 Exact solution of the model 

Though the functions $p_{low}$ and $p_{up}$ are different, it is easily checked that they share the same local suprema. Indeed, if we differentiate both functions with respect to parameters $\tilde m_g$, we see that the extremality conditions are given in both cases by the mean field equations:

$$\tilde m_g = \tanh\Bigl(\sum_{k=1}^{p}\frac{J_{g,k} + J_{k,g}}{2\alpha_g}\tilde m_k + \frac{h_g}{\alpha_g}\Bigr), \qquad g = 1, \dots, p. \qquad (4.32)$$

If we now use these equations to express $\tanh^{-1}\tilde m_g$ as a function of the $\tilde m_k$ and we substitute back into $p_{up}$ and $p_{low}$ we get the same function:

$$p = -\frac{1}{2}\sum_{g,k=1}^{p}J_{g,k}\tilde m_g\tilde m_k - \frac{1}{2}\sum_{g=1}^{p}\alpha_g\ln\frac{1 - \tilde m_g^2}{4}. \qquad (4.33)$$

Since this function returns the value of the pressure when the vector $(\tilde m_1, \dots, \tilde m_p)$ corresponds to an extremum, and this is the same both for $p_{low}$ and $p_{up}$, we have proved the following:

Theorem 1. Given a Hamiltonian as defined in (4.6), and defining the pressure per particle as $p_N = \frac{1}{N}\ln Z_N$, given parameters $J_{ij}$ and $h_i$, the thermodynamic limit

$$\lim_{N\to\infty} p_N = p$$

of the pressure exists, and can be expressed in one of the following equivalent forms:

a) $p = \sup_{\tilde m_1,\dots,\tilde m_p} p_{low}$

b) $p = \sup_{\tilde m_1,\dots,\tilde m_p} p_{up}$
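The coincidence of the two bounds at a solution of the mean field equations can be checked numerically. The sketch below solves (4.32) by simultaneous fixed-point iteration and evaluates $p_{up}$, $p_{low}$ and the closed form (4.33) at the solution. The iteration is only guaranteed to converge in weakly interacting regimes such as the hypothetical one used here, and it returns a stationary point, which is not necessarily the supremum in general.

```python
import math

def mean_field_step(m, J, h, alpha):
    """One simultaneous update of the mean field equations (4.32)."""
    p = len(m)
    return [math.tanh(sum((J[g][k] + J[k][g]) * m[k] for k in range(p)) / (2.0 * alpha[g])
                      + h[g] / alpha[g]) for g in range(p)]

def solve_mean_field(J, h, alpha, iterations=2000):
    """Iterate (4.32) from m = 0; assumes a regime where the map contracts."""
    m = [0.0] * len(h)
    for _ in range(iterations):
        m = mean_field_step(m, J, h, alpha)
    return m

def quadratic(m, J):
    return sum(J[g][k] * m[g] * m[k] for g in range(len(m)) for k in range(len(m)))

def entropy(x):
    """Entropy S of the text, for -1 < x < 1."""
    return -(1 + x) / 2 * math.log((1 + x) / 2) - (1 - x) / 2 * math.log((1 - x) / 2)

def p_up(m, J, h, alpha):
    total = math.log(2.0) - 0.5 * quadratic(m, J)
    for j in range(len(m)):
        field = sum((J[i][j] + J[j][i]) * m[i] for i in range(len(m))) / (2.0 * alpha[j])
        total += alpha[j] * math.log(math.cosh(field + h[j] / alpha[j]))
    return total

def p_low(m, J, h, alpha):
    return (0.5 * quadratic(m, J) + sum(hg * mg for hg, mg in zip(h, m))
            + sum(a * entropy(mg) for a, mg in zip(alpha, m)))

def p_closed(m, J, alpha):
    """The common value (4.33), valid at solutions of (4.32)."""
    return (-0.5 * quadratic(m, J)
            - 0.5 * sum(a * math.log((1 - mg * mg) / 4) for a, mg in zip(alpha, m)))

if __name__ == "__main__":
    J = [[0.3, 0.1], [0.1, 0.2]]     # hypothetical couplings
    h = [0.1, 0.05]                  # hypothetical fields
    alpha = [0.5, 0.5]               # hypothetical population fractions
    m = solve_mean_field(J, h, alpha)
    print(m, p_up(m, J, h, alpha), p_low(m, J, h, alpha), p_closed(m, J, alpha))
```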

4.5 An analytic result for a two-population model 

The form we derived for the pressure can be rightfully considered a solution of the statistical mechanical model, since it expresses the thermodynamic properties of a large number of particles in terms of a finite number of parameters.

Nevertheless, the equations of state cannot be solved explicitly in terms of the parameters: indeed, even the phase diagram for the two-population case has only been characterised fully in a subset of our parameter space, in which it has been found useful for a few physical applications [13], [46]. This gives us a feeling of how the mean field assumption, though simplistic from one point of view, can give rise to models exhibiting non-trivial behaviour.

In this section we shall focus on the two-population case, which is the case considered in the applications of the next chapter, and find an analytic result concerning the maximum number of equilibrium states arising from our equations of state. In particular we shall prove that, for any choice of the parameters, the total number of local maxima for the function $p(\tilde m_1, \tilde m_2)$ is less than or equal to five.

By applying a convenient relabelling to the model's parameters, we get the mean field equations for our two-population model in the following form:

$$
\begin{cases}
\tilde m_1 = \tanh(J_{11}\alpha\tilde m_1 + J_{12}(1-\alpha)\tilde m_2 + h_1) \\
\tilde m_2 = \tanh(J_{12}\alpha\tilde m_1 + J_{22}(1-\alpha)\tilde m_2 + h_2)
\end{cases}
$$

and correspond to the stationarity conditions of $p(\tilde m_1, \tilde m_2)$. So, a subset of solutions to this system of equations are local maxima, and some among them correspond to the thermodynamic equilibrium.

These equations give a two-dimensional generalization of the Curie-Weiss mean field equation. Solutions of the classic Curie-Weiss model can be analysed by elementary geometry: in our case, however, the geometry is that of 2-dimensional maps, and it pays to recall that Hénon's map, a seemingly harmless 2-dimensional diffeomorphism of $\mathbb{R}^2$, is known to exhibit full-fledged chaos. Therefore, the parametric dependence of solutions, and in particular the number of solutions corresponding to local maxima of $p(\tilde m_1, \tilde m_2)$, is in no way apparent from the equations themselves.

We can, nevertheless, recover some geometric features from the analogy with the one-dimensional picture. For the classic Curie-Weiss equation, continuity and the Intermediate Value Theorem from elementary calculus assure the existence of at least one solution. In higher dimensions we can resort to the analogous result, Brouwer's Fixed Point Theorem, which states that any continuous map of a topological closed ball into itself has at least one fixed point. This theorem, applied to the smooth map $R$ on the square $[-1, 1]^2$, given by

$$
\begin{aligned}
R_1(\tilde m_1, \tilde m_2) &= \tanh(J_{11}\alpha\tilde m_1 + J_{12}(1-\alpha)\tilde m_2 + h_1) \\
R_2(\tilde m_1, \tilde m_2) &= \tanh(J_{12}\alpha\tilde m_1 + J_{22}(1-\alpha)\tilde m_2 + h_2)
\end{aligned}
$$

establishes the existence of at least one point of thermodynamic equilibrium.

We can gain further information by considering the precise form of the equations: by inverting the hyperbolic tangent in the first equation, we can express $\tilde m_2$ as a function of $\tilde m_1$, and vice-versa for the second equation. Therefore, when $J_{12} \neq 0$ we can rewrite the equations in the following fashion:

$$
\begin{cases}
\tilde m_2 = \dfrac{1}{J_{12}(1-\alpha)}\bigl(\tanh^{-1}\tilde m_1 - J_{11}\alpha\tilde m_1 - h_1\bigr) \\[2ex]
\tilde m_1 = \dfrac{1}{J_{12}\alpha}\bigl(\tanh^{-1}\tilde m_2 - J_{22}(1-\alpha)\tilde m_2 - h_2\bigr)
\end{cases}
\qquad (4.34)
$$

Consider, for example, the first equation: this defines a function $\tilde m_2(\tilde m_1)$, and we shall call its graph curve $\gamma_1$. Let's consider the second derivative of this function:

$$\frac{\partial^2\tilde m_2}{\partial\tilde m_1^2} = \frac{1}{J_{12}(1-\alpha)}\cdot\frac{2\tilde m_1}{(1 - \tilde m_1^2)^2}.$$

We see immediately that this second derivative is strictly increasing, and that it changes sign exactly at zero. This implies that $\gamma_1$ can be divided into three monotonic pieces, each having strictly positive third derivative as a function of $\tilde m_1$. The same thing holds for the second equation, which defines a function $\tilde m_1(\tilde m_2)$, and a corresponding curve $\gamma_2$. An analytical argument easily establishes that there exist at most 9 crossing points of $\gamma_1$ and $\gamma_2$ (for convenience we shall label the three monotonic pieces of $\gamma_1$ as I, II and III, from left to right): since $\gamma_2$, too, has a strictly positive third derivative, it follows that it intersects each of the three monotonic pieces of $\gamma_1$ at most three times, and this leaves the number of intersections between $\gamma_1$ and $\gamma_2$ bounded above by 9 (see an example of this in Figure 4.1).

By definition of the mean field equations, the stationary points of the pressure correspond to crossing points of $\gamma_1$ and $\gamma_2$. Furthermore, common sense tells us that not all of these stationary points can be local maxima. This is indeed true, and it is proved by the following:

Proposition 16. The function $p(\tilde m_1, \tilde m_2)$ admits at most 5 maxima.

To prove Proposition 16 we shall need the following:

Lemma 5. Say $P_1$ and $P_2$ are two crossing points linked by a monotonic piece of one of the two functions considered above. Then at most one of them is a local maximum of the pressure $p(\tilde m_1, \tilde m_2)$.

Proof of Lemma 5. The proof consists of a simple observation about the meaning of our curves. The mean field equations are stationarity conditions for the pressure, so each of $\gamma_1$ and $\gamma_2$ is made of points where one of the two components of the gradient of $p(\tilde m_1, \tilde m_2)$ vanishes. Without loss of generality assume that $P_1$ is a maximum, and that the component that vanishes on the piece of curve $\gamma$ that links $P_1$ to $P_2$ is $\frac{\partial p}{\partial\tilde m_1}$.

Since $P_1$ is a local maximum, $p(\tilde m_1, \tilde m_2)$ locally increases on the piece of curve $\gamma$. On the other hand, the directional derivative of $p(\tilde m_1, \tilde m_2)$ along $\gamma$ is given by

$$t\cdot\nabla p$$

where $t$ is the unit tangent to $\gamma$. Now we just need to notice that by assumption, for any point in $\gamma$, $t$ lies in the same quadrant, while $\nabla p$ is vertical with a definite direction. This implies that the scalar product giving the directional derivative is non-negative over all of $\gamma$, which prevents $P_2$ from being a maximum.

□ 

Proof of Proposition 16. The proof considers two separate cases:

a) All crossing points can be joined in a chain by using monotonic pieces of curve such as the one defined in the lemma;

b) At least one crossing point is linked to the others only by non-monotonic pieces of curve.

In case a), all stationary points can be joined in a chain in which no two local maxima can be nearest neighbours, by the lemma. Since there are at most 9 stationary points, there can be at most 5 local maxima.

For case b) assume that there is a point, call it $P$, which is not linked to any other point by a monotonic piece of curve. Without loss of generality, say that $P$ lies on I (which, we recall, is defined as the leftmost monotonic piece of $\gamma_1$). By assumption, I cannot contain other crossing points apart from $P$, for otherwise $P$ would be monotonically linked to at least one of them, contradicting the assumption. On the other hand, each of II and III contains at most 3 stationary points, and, by Lemma 5, at most 2 of these are maxima. So we have at most 2 maxima on each of II and III, and at most 1 maximum on I, which leaves the total bounded above by 5. The cases in which $P$ lies on II, or on III, are proved analogously, giving the result.

□ 
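The bound of 9 crossing points can be explored numerically. The sketch below follows the curve $\gamma_1$ defined by the first equation of (4.34), evaluates the residual of the second mean field equation along it, and counts the sign changes of this residual on a fine grid. It assumes $J_{12} \neq 0$, only counts crossings (it does not classify them as maxima), and can undercount if two crossings fall within the same grid cell; the parameter values in the example are hypothetical.

```python
import math

def count_crossings(J11, J12, J22, alpha, h1, h2, steps=200_001):
    """Count sign changes of the second mean field equation along gamma_1."""
    eps = 1e-9                       # stay strictly inside (-1, 1)
    previous = None
    count = 0
    for k in range(steps):
        m1 = -1.0 + eps + (2.0 - 2.0 * eps) * k / (steps - 1)
        # gamma_1: m2 as a function of m1, from the first equation of (4.34)
        m2 = (math.atanh(m1) - J11 * alpha * m1 - h1) / (J12 * (1.0 - alpha))
        # residual of the second mean field equation at the point (m1, m2)
        residual = m2 - math.tanh(J12 * alpha * m1 + J22 * (1.0 - alpha) * m2 + h2)
        if residual == 0.0:
            continue                 # skip exact zeros; the next sign decides
        if previous is not None and previous * residual < 0.0:
            count += 1
        previous = residual
    return count

if __name__ == "__main__":
    # Strong intra-population couplings, weak inter-population coupling.
    print(count_crossings(4.0, 0.1, 4.0, 0.5, 0.01, -0.02))
    # Weak couplings throughout.
    print(count_crossings(0.5, 0.2, 0.5, 0.5, 0.05, -0.03))
```

Since the residual tends to infinities of opposite sign at the two ends of the interval, the count is always odd, in agreement with the existence of at least one solution.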
Figure 4.1: The crossing points correspond to solutions of the mean field equations

Chapter 5

Case studies



In previous chapters we defined a model which, generalizing well known tools from econometrics,
provides a viable approach to study phenomena of human interaction. Its well-posedness
as an equilibrium statistical mechanical model, proved in the last chapter, though
supporting the idea that modelling social phenomena working from the bottom up¹ may be
feasible, doesn't imply the relevance of the proposed tool to any actual scenario: indeed,
for any model such relevance may only be established as a result of success in describing,
and most importantly predicting, events from the real world.

There are many possible instances from the social sciences for which quantitative modelling
is an appealing prospect. Due to the increasingly global nature of human mobility,
one particularly timely social issue is immigration. The applicability of our model to immigration
matters was considered in References [16] and [17]. Reference [17] analyses how the
microscopic assumptions of the model reflect the tendency of individuals to act consistently
with their cultural legacy as well as with what they identify as their social group, which are
both tenets in the field of social psychology. The numerical analysis carried out in Reference
[16] shows how such simple assumptions are enough for the model to identify regimes
in which a global change in a cultural trait is triggered by a small fraction of immigrants
interacting with a large population of residents.

The descriptive power shown by the model in the case of immigration further supports 
the view that equilibrium statistical mechanics can play a role in a quantitative theory of 
social phenomena. However, though qualitatively inspiring, the immigration scenario seems 
ill-suited as a first quantitative case study, due to the intrinsic difficulty of finding a database 
that characterizes such a social issue adequately. We therefore turn to the problem of giving
our model a first implementation on some "simpler" matters.

¹ That is, starting from individual interactions and trying to establish patterns that might be at work on
a larger scale.

The aim of this chapter is two-fold. On the one hand we are interested in assessing the
simplest instance of the model considered in the last chapter, that is a mean field model
where the population has been partitioned into two groups, based on their geographical
residence, so that the model generalizes a discrete choice model with one binary attribute.

On the other hand, we would like to propose two simple procedures of model estimation,
that we feel might be very appealing for models at an early stage of development. The first
procedure is statistical in nature, and it is based on a method developed by Berkson [7],
whereas the second takes a statistical mechanical perspective by considering the role played
by the fluctuations of the main observable quantities for the model.



5.1 The model 

We consider a population of individuals facing a "YES/NO" question, such as choosing
between marrying through a religious or a civil ritual, or voting for or against the death
penalty in a referendum. We index individuals by $i$, $i = 1, \dots, N$, and assign a numerical value
to each individual's choice $\sigma_i$ in the following way:

$$\sigma_i = \begin{cases} +1 & \text{if } i \text{ says YES} \\ -1 & \text{if } i \text{ says NO} \end{cases}$$



Consistently with the many-population Curie-Weiss model analysed in the last chapter,
which as we saw generalises the multinomial logit model described in Chapter 2, we assume
that the joint probability distribution of these choices is well approximated by a Boltzmann-Gibbs
distribution corresponding to the following Hamiltonian

$$H_N(\sigma) = -\sum_{i,l=1}^{N} J_{il}\, \sigma_i \sigma_l - \sum_{i=1}^{N} h_i \sigma_i.$$

Heuristically, this distribution favours the agreement of people's choices $\sigma_i$ with some
external influence $h_i$ which varies from person to person, and at the same time favours
agreement of a couple of people whenever their interaction coefficient $J_{il}$ is positive, whereas
it favours disagreement whenever $J_{il}$ is negative.
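
The heuristic reading above can be checked on a toy system. The following is a minimal Python sketch, not part of the estimation procedure: it enumerates all configurations of a very small population and computes their Boltzmann-Gibbs probabilities. It assumes weights proportional to $e^{-H_N(\sigma)}$ (inverse temperature absorbed into the parameters); the function names and the numerical values of the couplings and fields are hypothetical and chosen only for illustration.

```python
import itertools
import math


def hamiltonian(sigma, J, h):
    """H_N(sigma) = -sum_{i,l} J[i][l] s_i s_l - sum_i h[i] s_i."""
    n = len(sigma)
    pair = sum(J[i][l] * sigma[i] * sigma[l] for i in range(n) for l in range(n))
    field = sum(h[i] * sigma[i] for i in range(n))
    return -pair - field


def gibbs_distribution(J, h):
    """Return {configuration: probability}, with weights exp(-H_N) (assumed form)."""
    n = len(h)
    configs = list(itertools.product((+1, -1), repeat=n))
    weights = [math.exp(-hamiltonian(s, J, h)) for s in configs]
    z = sum(weights)  # partition function
    return {s: w / z for s, w in zip(configs, weights)}


# Hypothetical toy values: three individuals, uniform positive coupling
# between distinct individuals, and a small positive external field.
N = 3
J = [[0.2 if i != l else 0.0 for l in range(N)] for i in range(N)]
h = [0.1] * N

dist = gibbs_distribution(J, h)
for config, prob in sorted(dist.items(), key=lambda kv: -kv[1]):
    print(config, round(prob, 4))
```

With positive couplings and a positive field, the unanimous YES configuration comes out as the most probable one, followed by the unanimous NO configuration, while mixed configurations are penalized.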

Given the setting, the model consists of two basic steps: 

1) A parametrization of the quantities $J_{il}$ and $h_i$,

2) A systematic procedure allowing us to "measure" the parameters characterizing the
model, starting from statistical data (such as surveys, polls, etc.).



The parametrization must be chosen to fit as well as possible the data format available, 
in order to define a model which is able to make good use of the increasing wealth of data 
available through information technologies. 



5.2 Discrete choice 

Let us first consider our model when it ignores interactions, $J_{il} = 0 \;\; \forall\, i, l \in (1, N)$, that
is

$$H_N(\sigma) = -\sum_{i=1}^{N} h_i \sigma_i.$$

The model shall be applied to data coming from surveys, polls, and censuses, which 
means that together with the answer to our binary question, we shall have access to infor- 
mation characterizing individuals from a socio-economical point of view. We can formalize 
such further information by assigning to each person a vector of socio-economic attributes 

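$$a_i = \{a_i^{(1)}, a_i^{(2)}, \dots, a_i^{(k)}\}$$

where, for instance,

$$a_i^{(1)} = \begin{cases} 1 & \text{for } i \text{ Male} \\ 0 & \text{for } i \text{ Female} \end{cases}$$

and

$$a_i^{(2)} = \begin{cases} 1 & \text{for } i \text{ Employee} \\ 0 & \text{for } i \text{ Self-employed} \end{cases}$$

etc.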

As we have seen in Chapter 2, the general setting of the multinomial logit allows us to
exploit the supplementary data by assuming that $h_i$ (which is the "field" influencing the
choice of $i$) is a function of the vector of attributes $a_i$. Since for the sake of simplicity we
choose our attributes to be binary variables, the most general form for $h_i$ turns out
to be linear

$$h_i = \sum_{j=1}^{k} \alpha_j a_i^{(j)} + \alpha_0$$

and the model's parameters are given by the components of the vector $\alpha = \{\alpha_0, \alpha_1, \dots, \alpha_k\}$.
It is worth pointing out that the parameters $\alpha_j$, $j = 0, \dots, k$, do not depend on the specific
individual $i$.

We know that discrete choice theory holds that, when making a choice, each person
weighs various factors such as his or her own gender, age, income, etc., so as to maximize in
probability the benefit arising from his or her decision. The parameters $\alpha$ tell us the relative
weight (i.e. the relative importance) that the various socio-economic factors
have when people are making a decision with respect to our binary question. The parameter
$\alpha_0$ does not multiply any specific attribute, and thus it is a homogeneous influence which is
felt by all people in the same way, regardless of their individual characteristics. A discrete
choice model is considered good when the parametrized attributes are very suitable for
the specific choice, so that the parameter $\alpha_0$ is found to be small in comparison to the
attribute-specific ones.
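
As an illustration of how the linear field enters the choice probability, the following minimal Python sketch computes $h_i$ from a vector of binary attributes and then the logit probability $p_i = e^{h_i}/(e^{h_i} + e^{-h_i})$ recalled from Chapter 2. The function names and the parameter values are hypothetical; they are not estimates obtained from any data set.

```python
import math


def field(attributes, alpha):
    """h_i = alpha[0] + sum_j alpha[j] * a_i^(j), for binary attributes a_i^(j)."""
    return alpha[0] + sum(a_j * w_j for a_j, w_j in zip(attributes, alpha[1:]))


def prob_yes(attributes, alpha):
    """p_i = e^{h_i} / (e^{h_i} + e^{-h_i})."""
    h = field(attributes, alpha)
    return math.exp(h) / (math.exp(h) + math.exp(-h))


# Hypothetical parameters (alpha_0, alpha_1, alpha_2) for k = 2 binary attributes.
alpha = [0.1, 0.5, -0.3]

for attrs in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(attrs, round(prob_yes(attrs, alpha), 4))
```

Since $p_i = (1 + \tanh h_i)/2$, a vanishing field gives probability one half, and each attribute shifts the probability up or down according to the sign of its coefficient.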

We have shown in Chapter 2 that elementary statistical mechanics gives us the probability
of an individual $i$ with attributes $a_i$ answering "YES" to our question as:

$$p_i = P(\sigma_i = 1) = \frac{e^{h_i}}{e^{h_i} + e^{-h_i}}, \qquad h_i = \sum_{j=1}^{k} \alpha_j a_i^{(j)} + \alpha_0,$$

which as we saw is equivalent to the result obtained by applying economics' utility maximization
principle to a random utility with Gumbel disturbances. Therefore collecting the
choices made by a relevant number of people, and keeping track of their socio-economic
attributes, allows us to use statistics in order to find the value of $\alpha$ for which our distribution
best fits the real data. This in turn allows us to assess the implications on aggregate
behavior if we apply incentives to the population which affect a specific attribute, as can be
commodity prices in a market situation.

5.3 Interaction

The kind of model described in the last section has been successfully used by econometrics
for the last thirty years [50], and has opened the way to the quantitative study of social
phenomena. Such models, however, only apply to situations where the functional relation
between the people's attributes $a$ and the population's behavior is a smooth one: it is ever
more evident, on the other hand, that behavior at a societal level can be marked by sudden
jumps [MlEIlliT].

There exist many examples from linguistics, economics, and sociology where it has been
observed how the global behaviour of large groups of people can change in an abrupt manner
as a consequence of slight variations in the social structure (such as, for instance, a change
in the pronunciation of a language due to a small immigration rate, or a substantial
decrease in crime rates due to seemingly minor action taken by the authorities) [3l \3T\ Wl\ .
From a statistical mechanical point of view, these abrupt transitions may be considered as
phase transitions caused by the interaction between individuals, and this is what led us to
consider in this thesis the interesting mapping between discrete choice econometrics and
the Curie-Weiss theory, first stated in [21].

We then go back to studying the general interacting model

$$H_N(\sigma) = -\sum_{i,l=1}^{N} J_{il}\, \sigma_i \sigma_l - \sum_{i=1}^{N} h_i \sigma_i, \qquad (5.1)$$

while keeping

$$h_i = \sum_{j=1}^{k} \alpha_j a_i^{(j)} + \alpha_0.$$

We now need to find a suitable parametrization for the interaction coefficients $J_{il}$. Since
each person is characterized by $k$ binary socio-economic attributes, the population can be
naturally partitioned into $2^k$ subgroups, so that using the mean-field assumption allows
one to rewrite the model in terms of subgroup-specific magnetizations $m_g$, as in the general
Hamiltonian (4.1). Equation (4.1) is general enough to consider populations with different
relative sizes (such as one in which residents make up a much larger share of the population than
immigrants): nevertheless, it turns out that the mean-field assumption implies a relation of
direct proportionality between interaction coefficients and population sizes, which might be
considered unnatural.

The approach taken in this thesis, therefore, is to consider sub-populations of comparable
size, and model them in the thermodynamic limit as having equal size. Specifically, in all
cases we divide the data into two geographical regions which have a similar population. This
"equal size" assumption can be considered as part of the modelling process: by using it to
analyze data, as we do here, we can gain insights on how to relax it in future refinements of
the model. So, for the time being, let $J_{il}$ depend explicitly on a partition into sub-populations
of equal sizes. By using the mean-field assumption we can express this as follows

$$J_{il} = \frac{2^k}{N}\, J_{gg'} \quad \text{if } i \in g \text{ and } l \in g',$$

where $g$ and $g'$ are two sub-populations (not necessarily distinct). This in turn allows us to
rewrite (5.1) as

$$H_N(\sigma) = -\frac{N}{2^k} \left( \sum_{g,g'=1}^{2^k} J_{gg'}\, m_g m_{g'} + \sum_{g=1}^{2^k} h_g m_g \right),$$

where $m_g$ is the average opinion of group $g$:

$$m_g = \frac{1}{N/2^k} \sum_{i=(g-1)N/2^k+1}^{gN/2^k} \sigma_i.$$



We readily see how this is the many-population model considered in the previous chapter,
and this gives us a solid microscopic foundation for the theory. Indeed, the results
we obtained through relatively elementary mathematics establish rigorously the existence
of the model's thermodynamic limit, as well as its factorization properties, and just as
importantly provide us with a closed form for the thermodynamic state equations.
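
The passage from pairwise couplings to group averages can be checked numerically. The sketch below is a minimal Python illustration under stated assumptions: $G = 2^k$ groups of equal size, pairwise couplings normalized as $J_{il} = (G/N) J_{gg'}$, and hypothetical values for $J_{gg'}$ and $h_g$. It evaluates the Hamiltonian both as a double sum over individuals and in terms of the group averages $m_g$, and the two values should coincide.

```python
import random


def pairwise_hamiltonian(sigma, group_of, J_group, h_group):
    """-sum_{i,l} J_il s_i s_l - sum_i h_i s_i, with J_il = (G / N) J_gg' (assumed)."""
    N, G = len(sigma), len(h_group)
    energy = 0.0
    for i in range(N):
        g = group_of[i]
        energy -= h_group[g] * sigma[i]
        for l in range(N):
            energy -= (G / N) * J_group[g][group_of[l]] * sigma[i] * sigma[l]
    return energy


def group_hamiltonian(sigma, group_of, J_group, h_group):
    """-(N / G) * (sum_{g,g'} J_gg' m_g m_g' + sum_g h_g m_g)."""
    N, G = len(sigma), len(h_group)
    size = N // G  # equal-size groups
    m = [sum(s for s, g in zip(sigma, group_of) if g == a) / size for a in range(G)]
    quadratic = sum(J_group[a][b] * m[a] * m[b] for a in range(G) for b in range(G))
    linear = sum(h_group[a] * m[a] for a in range(G))
    return -(N / G) * (quadratic + linear)


random.seed(0)
N, G = 8, 2                                  # k = 1, hence G = 2 groups
group_of = [i * G // N for i in range(N)]    # first half: group 0, second half: group 1
sigma = [random.choice((-1, 1)) for _ in range(N)]
J_group = [[1.0, 0.3], [-0.2, 0.7]]          # hypothetical couplings J_gg'
h_group = [0.1, -0.4]                        # hypothetical fields h_g

e_pair = pairwise_hamiltonian(sigma, group_of, J_group, h_group)
e_group = group_hamiltonian(sigma, group_of, J_group, h_group)
print(e_pair, e_group)
```

Note that only the sum $J_{12} + J_{21}$ enters the group form, so an asymmetric coupling matrix gives the same energy as its symmetrized version.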

Therefore if we are willing to test how well the model's assumptions compare with real
data, we can use these equations as the main tool for a procedure of statistical estimation.
Here we shall confront the simple case where $k = 1$. This is a bipartite model which, as we
know from the last chapter, can have at most five metastable equilibrium states, given by
the thermodynamically stable solutions to the following equations:

$$\begin{aligned} \bar m_1 &= \tanh(J_{11} \bar m_1 + J_{12} \bar m_2 + h_1), \\ \bar m_2 &= \tanh(J_{21} \bar m_1 + J_{22} \bar m_2 + h_2). \end{aligned} \qquad (5.2)$$

Equation (4.32), which was derived from the model's exact solution, shows that the
equilibrium state equations for a system consisting of two parts of equal size do not carry
two different parameters $J_{12}$ and $J_{21}$, but that, even if these two parameters were different
in the Hamiltonian, what characterizes each of the two subparts is rather their average
$(J_{12} + J_{21})/2$. We keep $J_{12}$ and $J_{21}$ as two distinct parameters throughout the statistical
application in order to use them as a consistency test: we shall be able to consider systems
to be in equilibrium only if $J_{12} - J_{21} = 0$.

The state equations (5.2) allow us, in particular, to write the probability of $i$ choosing
YES in a closed form, similar to the non-interacting one:

$$p_i = P(\sigma_i = 1) = \frac{e^{U_g}}{e^{U_g} + e^{-U_g}}, \qquad (5.3)$$

where

$$U_g = \sum_{g'=1}^{2} J_{g,g'}\, \bar m_{g'} + h_g. \qquad (5.4)$$

This is the basic tool needed to estimate the model starting from real data. We describe 
the estimation procedure in the next section. 
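
Before turning to estimation, it may help to see the forward direction, from parameters to choice probabilities. The following minimal Python sketch solves state equations of the form $\bar m_g = \tanh(\sum_{g'} J_{g,g'} \bar m_{g'} + h_g)$ by plain fixed-point iteration and then evaluates the probability of a YES in each group. The iteration scheme, the starting point and all parameter values are assumptions made for illustration; fixed-point iteration finds only one solution, which depends on the starting point, and says nothing about the other possible equilibria.

```python
import math


def solve_state_equations(J, h, m0=(0.5, 0.5), tol=1e-12, max_iter=10_000):
    """Iterate m_g <- tanh(sum_g' J[g][g'] m_g' + h[g]) until it stabilizes."""
    m = list(m0)
    n = len(m)
    for _ in range(max_iter):
        new = [math.tanh(sum(J[g][k] * m[k] for k in range(n)) + h[g]) for g in range(n)]
        if max(abs(a - b) for a, b in zip(new, m)) < tol:
            return new
        m = new
    raise RuntimeError("fixed-point iteration did not converge")


def prob_yes(J, h, m):
    """p_g = e^{U_g} / (e^{U_g} + e^{-U_g}), with U_g = sum_g' J[g][g'] m_g' + h[g]."""
    probs = []
    for g in range(len(m)):
        U = sum(J[g][k] * m[k] for k in range(len(m))) + h[g]
        probs.append(math.exp(U) / (math.exp(U) + math.exp(-U)))
    return probs


# Hypothetical parameters for a two-group system.
J = [[1.2, 0.1], [0.1, 1.3]]
h = [0.05, 0.10]

m = solve_state_equations(J, h)
p = prob_yes(J, h, m)
print("m =", [round(x, 4) for x in m])
print("p =", [round(x, 4) for x in p])
```

At a solution of the state equations the probability of a YES in group $g$ reduces to $(1 + \bar m_g)/2$, which is simply the share of YES answers in that group.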

5.4 Estimation 

We have seen that according to the model an individual $i$ belonging to group $g$ has probability
of choosing "YES" equal to

$$p_i = \frac{e^{U_g}}{e^{U_g} + e^{-U_g}},$$

where

$$U_g = \sum_{g'} J_{g,g'}\, \bar m_{g'} + h_g.$$

The standard approach of statistical estimation for discrete models is to maximize the
probability of observing a sample of data with respect to the parameters of the model (see
e.g. [6]). This is done by maximizing the likelihood function

$$L = \prod_i P(\sigma_i)$$

with respect to the model's parameters, which in our case consist of the interaction matrix
$J$ and the vector $\alpha$.

Our model, however, is such that $p_i$ is a function of the equilibrium states $\bar m_g$, which in
turn are discontinuous functions of the model's parameters. This problem takes away much
of the appeal of the maximum likelihood procedure, and calls for a more feasible alternative.

The natural alternative to maximum likelihood for problems of model regression is given
by the least squares method [25], which simply minimizes the squared norm of the difference
between the observed quantities and the model's prediction. Since in our case the observed
quantities are the empirical average opinions $\bar m_g$, we need to find the parameter values
which minimize

$$\sum_g \left( \bar m_g - \tanh U_g \right)^2, \qquad (5.5)$$

which in our case corresponds to satisfying as closely as possible the state equations (5.2) in
squared norm. This, however, is still computationally cumbersome due to the non-linearity
of the function $\tanh(U_g)$. This problem had already been encountered by Berkson back in
the nineteen-fifties, when developing a statistical methodology for bioassay [7]: this is an
interesting point, since this stimulus-response kind of experiment bears a close analogy to
the natural kind of applications for a model of social behavior, such as linking stimuli given
by incentives through policy and media to behavioral responses on the part of a population.

The key observation in Berkson's paper is that, since $U_g$ is a linear function of the
model's parameters, and the function $\tanh(x)$ is invertible, a viable modification to least
squares is given by minimizing the following quantity, instead:

$$\sum_g \left( \operatorname{arctanh} \bar m_g - U_g \right)^2. \qquad (5.6)$$

This reduces the problem to a linear least squares problem which can be handled with
standard statistical software, and Berkson finds an excellent numerical agreement between
this method and the standard least squares procedure.

There are nevertheless a number of issues with Berkson's approach, which are analyzed
in [6], p. 96. All the problems arising can be traced to the fact that to build (5.6) we
are collecting the individual observations into subgroups, each of average opinion $\bar m_g$. The
problem is well exemplified by the case in which a subgroup has average opinion $\bar m_g = \pm 1$: in
this case $\operatorname{arctanh} \bar m_g = \pm\infty$, and the method breaks down. However the event $\bar m_g = \pm 1$ has
a vanishing probability when the size of the groups increases, so that the method behaves
properly for large enough samples.

The proposed measurement technique is best elucidated by showing a few simple concrete 
examples, which we do in the next section. 

5.5 Case studies 

We shall carry out the estimation program for real situations which correspond to a very
simple case of our model. The data were obtained from periodical censuses carried out by
Istat²: since census data concern events which are recorded in official documents, for a
large number of people, we find them to be an ideal testing ground for our model.

For the sake of simplicity, individuals are described by a single binary attribute charac- 
terizing their place of residence (either Northern or Southern Italy) and we chose, among the 
several possible case studies, the ones for which choices are likely to involve peer interaction 
in a major way. 

² Italian National Institute of Statistics

The first phenomenon we choose to study concerns the share of people who chose to
marry through a religious ritual, rather than through a civil one. The second case deals
with divorces: here individuals are faced with the choice of a consensual or a non-consensual
divorce. The last test we perform regards the study of suicidal tendencies, in particular the
mode of execution.

5.5.1 Civil vs religious marriage in Italy, 2000-2006 

To address this first task we use data from the annual report on the institution of marriage 
compiled by Istat in the seven years going from 2000 to 2006. The reason for choosing this 
specific social question is both a methodological and a conceptual one. 

Firstly, we are motivated by the exceptional quality of the data available in this case,
since it is a census which concerns a population of more than 250 thousand people per year,
for seven years. This allows us some leeway from the possible issues regarding the sample
size, such as the one highlighted in the last section. Just as importantly, the availability
of a time series of data measured at regular intervals also allows us to check the consistency of the
data as well as the stability of the phenomenon.

Secondly, marriage is probably one of the few matters where a great number of individ- 
uals make a genuine choice concerning their life that gets recorded in an official document, 
as opposed to what happens, for example, in the case of opinion polls. 

We choose to study the data with one of the simplest forms of the model: individuals
are divided according only to a binary attribute $a^{(1)}$, which takes value 1 for people
from Northern Italy, and 0 for people from Southern Italy. In the formalism of Section 2,
therefore, the model is defined by the Hamiltonian

$$H_N(\sigma) = -\frac{N}{2} \left( J_{11} m_1^2 + (J_{12} + J_{21})\, m_1 m_2 + J_{22} m_2^2 + h_1 m_1 + h_2 m_2 \right),$$

$$h_i = \alpha_1 a_i^{(1)} + \alpha_0,$$

and the state equations to be used for Berkson's statistical procedure are given by (5.4).

Table 5.1 shows the time evolution of the share of men choosing to marry through a
religious ritual: the population is divided in two geographical classes. The first thing worth
noticing is that these shares show a remarkable stability over the seven-year period: this
confirms how, though arising from choices made by distinct individuals, who bear extremely
different personal motivations, the aggregate behavior can be seen as an observable feature
characterizing society as a whole.

                       % of religious marriages, by year
Region             2000    2001    2002    2003    2004    2005    2006
Northern Italy    68.35   64.98   61.97   60.90   57.91   55.95   54.64
Southern Italy    81.83   80.08   79.32   79.02   76.81   76.52   75.46

Table 5.1: Percentage of religious marriages, by year and geographical region



                                4-year period
Parameter      2000-2003       2001-2004       2002-2005       2003-2006
α0            -0.10 ± 0.42    -0.16 ± 0.15    -0.18 ± 0.10    -0.13 ± 0.01
α1             0.20 ± 0.59     0.20 ± 0.22     0.16 ± 0.14     0.14 ± 0.01
J1             1.16 ± 0.41     1.09 ± 0.16     1.01 ± 0.11     1.02 ± 0.01
J2             1.29 ± 0.89     1.40 ± 0.33     1.45 ± 0.21     1.36 ± 0.01
J12           -0.21 ± 0.89    -0.10 ± 0.33     0.03 ± 0.21    -0.01 ± 0.01
J21            0.09 ± 0.41     0.02 ± 0.16    -0.01 ± 0.11     0.01 ± 0.01

Table 5.2: Religious vs civil marriages: estimation of the interacting model



In order to apply Berkson's method of estimation, we choose to gather the data into periods
of four years, starting with 2000-2003, then 2001-2004, etc. Now, if we label the share
of men in group $g$ choosing the religious ritual in a specific year (say in 2000) by $m_g^{2000}$,
we have that the quantity that ought to be minimized in order to estimate the model's
parameters for the first period is the following, which we label $X^2$:

$$X^2 = \sum_{\text{year}=2000}^{2003} \sum_{g=1}^{2} \left( \operatorname{arctanh} m_g^{\text{year}} - U_g^{\text{year}} \right)^2,$$

$$U_g^{\text{year}} = \sum_{g'=1}^{2} J_{g,g'}\, m_{g'}^{\text{year}} + h_g,$$

$$h_g = \alpha_1 a_g^{(1)} + \alpha_0.$$

The results of the estimation for the four periods are shown in Table 5.2, whereas Table
5.3 shows the corresponding estimation for a discrete choice model which doesn't take into
account interaction.
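
To show the mechanics of the linearized fit, the following Python sketch applies ordinary least squares to $\operatorname{arctanh} m_g^{\text{year}}$ for the period 2000-2003, using the shares of Table 5.1. It is an illustrative sketch and not a reproduction of the computation behind Table 5.2: the conversion $m = 2p - 1$ from a share $p$ to an average opinion, the identification of Northern Italy with group 1, the use of the normal equations, and all function names are assumptions. No standard errors are computed, and with four yearly observations and three unknowns per group the fit is close to an interpolation.

```python
import math

# Shares of religious marriages (%) from Table 5.1, years 2000-2003.
north = [68.35, 64.98, 61.97, 60.90]
south = [81.83, 80.08, 79.32, 79.02]


def to_magnetization(share_percent):
    """Assumption: sigma = +1 for YES and -1 for NO, so that m = 2p - 1."""
    return 2.0 * share_percent / 100.0 - 1.0


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def least_squares(X, y):
    """Minimize |X beta - y|^2 through the normal equations X^T X beta = X^T y."""
    p = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    return solve_linear(XtX, Xty)


m1 = [to_magnetization(s) for s in north]   # group 1: Northern Italy (assumed labelling)
m2 = [to_magnetization(s) for s in south]   # group 2: Southern Italy
X = [[a, b, 1.0] for a, b in zip(m1, m2)]   # columns: m_1, m_2, constant

# One linear regression per group: arctanh(m_g) ~ J_g1 m_1 + J_g2 m_2 + h_g.
J11, J12, h1 = least_squares(X, [math.atanh(v) for v in m1])
J21, J22, h2 = least_squares(X, [math.atanh(v) for v in m2])

# With a^(1) = 1 for group 1 and 0 for group 2: h_1 = alpha_1 + alpha_0, h_2 = alpha_0.
alpha0, alpha1 = h2, h1 - h2
print("J =", [[J11, J12], [J21, J22]])
print("alpha0 =", alpha0, "alpha1 =", alpha1)
```

Because the two regressions share the same design matrix, the minimization of $X^2$ splits into one independent linear problem per group, which is what makes the transformed problem tractable with standard tools.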

                                4-year period
Parameter      2000-2003       2001-2004       2002-2005       2003-2006
α0             0.67 ± 0.15     0.63 ± 0.03     0.61 ± 0.06     0.58 ± 0.03
α1            -0.41 ± 0.1     -0.43 ± 0.04    -0.45 ± 0.08    -0.46 ± 0.04

Table 5.3: Religious vs civil marriages: estimation of the non-interacting model

5.5.2 Divorces in Italy, 2000-2005

The second case study uses data from the annual report compiled by Istat in the six years
going from 2000 to 2005. The data show how divorcing couples chose between a consensual
and a non-consensual divorce in Northern and Southern Italy. As shown in Table 5.4, here
too, when looking at the ratio of consensual to total divorces, the data show a
remarkable stability.

                   % of consensual divorces, by year
Region             2000    2001    2002    2003    2004    2005
Northern Italy    75.06   80.75   81.32   81.62   81.55   81.58
Southern Italy    58.83   72.80   71.80   72.61   72.76   72.08

Table 5.4: Percentage of consensual divorces, by year and geographical region

Again we gather the data into periods of four years: Table 5.5 presents the estimation
of our model's parameters for the whole available period, while in Table 5.6 we show the
corresponding fit by the non-interacting discrete choice model.

We notice that the estimated parameters have some analogies with the preceding case
study, in that here too the cross interactions $J_{12}$, $J_{21}$ are statistically close to zero whereas
the diagonal values $J_{11}$, $J_{22}$ are both greater than one, suggesting an interaction scenario
characterized by multiple equilibria [28]. Furthermore, in both cases the attribute-specific
parameter $\alpha_1$ is larger than the generic parameter $\alpha_0$ in the interacting model (Tables 5.2 and
5.5), as opposed to what we see in the non-interacting case (Tables 5.3 and 5.6): this suggests
that by accounting for interaction we might be able to better evaluate the role played by
socio-economic attributes.

                        4-year period
Parameter      2000-2003       2001-2004       2002-2005
α0             0.02 ± 0.06    -0.08 ± 0.01    -0.07 ± 0.01
α1            -0.25 ± 0.08    -0.22 ± 0.01    -0.23 ± 0.01
J1             1.59 ± 0.14     1.64 ± 0.01     1.66 ± 0.01
J2             1.16 ± 0.06     1.25 ± 0.01     1.25 ± 0.01
J12           -0.05 ± 0.06     0.01 ± 0.01     0.00 ± 0.01
J21           -0.08 ± 0.14     0.00 ± 0.01    -0.01 ± 0.01

Table 5.5: Consensual vs non-consensual divorces: estimation of the interacting model

                        4-year period
Parameter      2000-2003       2001-2004       2002-2005
α0             0.41 ± 0.13     0.48 ± 0.01     0.480046 ± 0.01
α1             0.28 ± 0.18     0.25 ± 0.02     0.261956 ± 0.01

Table 5.6: Consensual vs non-consensual divorces: estimation of the non-interacting model

5.5.3 Suicidal tendencies in Italy, 2000-2007

The last case study deals with suicidal tendencies in Italy, again following the annual report
compiled by Istat in the eight years from 2000 to 2007, and we use the same geographical
attribute used for the former two studies.

The data in Table 5.7 show the percentage of deaths due to hanging as a mode of
execution. The topic of suicide is of particular relevance to sociology: indeed, the very first
systematic quantitative treatise in the social sciences was carried out by Emile Durkheim
[20], a founding father of the subject, who was puzzled by how a phenomenon as unnatural
as suicide could arise with the astonishing regularity that he found. Such a regularity has
even been dubbed "sociology's one law" [56], and there is hope that the connection to
statistical mechanics might eventually shed light on the origin of such a law.

Mirroring the two previous case studies, we present the time series in Table 5.7, whereas
Table 5.8 shows the estimation results for the interacting model, and Table 5.9 shows the
estimation results for the discrete choice model. Again, the data agree with the analogies
found for the two previous case studies.



                              % suicides by hanging
Region             2000    2001    2002    2003    2004    2005    2006    2007
Northern Italy    34.17   37.02   35.83   34.58   35.21   36.23   33.57   38.08
Southern Italy    37.10   37.40   37.34   38.54   34.71   38.90   40.63   36.66

Table 5.7: Percentage of suicides with hanging as mode of execution, by year and geographical region

                                        4-year period
Parameter      2000-2003       2001-2004       2002-2005       2003-2006       2004-2007
α0             0.01 ± …        0.02 ± 0.01     0.01 ± 0.01     0.02 ± 0.01     0.02 ± 0.01
α1             0.01 ± 0.01     0.00 ± 0.01     0.00 ± 0.01     0.00 ± 0.01     0.00 ± 0.01
J1             1.09 ± 0.01     1.09 ± 0.01     1.09 ± 0.02     1.10 ± 0.03     1.09 ± 0.01
J2             1.06 ± 0.01     1.08 ± 0.01     1.08 ± 0.01     1.07 ± 0.01     1.07 ± 0.01
J12               … ± 0.01     0.00 ± 0.01     0.00 ± 0.01     0.00 ± 0.01     0.00 ± 0.01
J21               … ± 0.01     0.01 ± 0.01     0.00 ± 0.02     0.01 ± 0.03     0.01 ± 0.01

Table 5.8: Suicidal tendencies: estimation of the interacting model



                                        4-year period
Parameter      2000-2003       2001-2004       2002-2005       2003-2006       2004-2007
α0            -0.25 ± 0.02    -0.27 ± 0.03    -0.26 ± 0.03    -0.24 ± 0.04    -0.25 ± 0.05
α1            -0.05 ± 0.03    -0.03 ± 0.04    -0.04 ± 0.04    -0.07 ± 0.06    -0.04 ± 0.07

Table 5.9: Suicidal tendencies: estimation of the non-interacting model



5.6 A statistical mechanical approach to model estimation 

We shall now estimate our model parameters using a different approach, which makes
explicit use of the time fluctuations of our main observable quantities $\bar m_i$. This approach is
not econometric, but typically statistical mechanical, in that it equates fluctuations observed
over time with fluctuations of a system whose equilibrium is defined by
an ensemble of states rather than by a single state. The problem of retracing a model's
parameters from observable quantities in this context has been referred to in the literature
as the "inverse Ising problem" (see e.g. [64]).

We start from the usual model

$$H_N(\sigma) = -\frac{N}{2} \left( J_{11} m_1^2 + (J_{12} + J_{21})\, m_1 m_2 + J_{22} m_2^2 + h_1 m_1 + h_2 m_2 \right), \qquad (5.7)$$

$$h_i = \alpha_1 a_i^{(1)} + \alpha_0,$$

and we shall analyze the data from our three case studies again using the model's state
equations

$$\begin{aligned} \bar m_1 &= \tanh(J_{11} \bar m_1 + J_{12} \bar m_2 + h_1), \\ \bar m_2 &= \tanh(J_{21} \bar m_1 + J_{22} \bar m_2 + h_2), \end{aligned} \qquad (5.8)$$

which, as we shall see, will now also provide us with the system's fluctuations as well as the
average quantities. Just as in the last section, we choose to use two distinct parameters $J_{12}$
and $J_{21}$ inside the state equations (5.8) instead of their average $\frac{1}{2}(J_{12} + J_{21})$ in order to test
for consistency.

5.6.1 Two views on susceptibility 

The method presented here comes from an observation about the quantity $\partial \bar m_i / \partial h_j$, which is called
$\bar m_i$'s susceptibility with respect to the external field $h_j$ in physics, or $\bar m_i$'s elasticity with respect
to the incentive $h_j$ in econometrics.

The two relevant points of view that make $\partial \bar m_i / \partial h_j$ such an interesting quantity are those
of statistical mechanics and thermodynamics.
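5.6.1.1 Statistical mechanics

For statistical mechanics $\partial \bar m_i / \partial h_j$ is a quantity defined internally to the system. The following
formula clarifies this point. From (5.7),

$$\frac{\partial \bar m_i}{\partial h_j} = \frac{\partial}{\partial h_j} \left[ \frac{\sum_\sigma m_i(\sigma)\, e^{-H_N(\sigma)}}{\sum_\sigma e^{-H_N(\sigma)}} \right] = \frac{N}{2} \left( \omega_N(m_i m_j) - \omega_N(m_i)\, \omega_N(m_j) \right) = c_{ij}. \qquad (5.9)$$

The quantity $\partial \bar m_i / \partial h_j$, which we shall refer to as $c_{ij}$ for notational convenience, is thus
simply the amount of fluctuations that we observe in the quantities $m_i$: if we imagine the system
as a closed box, and we imagine being inside such a closed box, we can in principle measure
$c_{ij}$ by studying the way the $m_i$ vary.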

5.6.1.2 Thermodynamics

The second point of view is intrinsically different: for thermodynamics $\partial \bar m_i / \partial h_j$ corresponds to
the response of the "closed box" mentioned in the last paragraph to an external influence
given by a small change in the field $h_j$. Differently from statistical mechanics, thermodynamics
cannot provide us with this response's value a priori from observations, since it
doesn't know any details of what is going on inside the box. Thermodynamics does tell us,
however, that the responses of the system to different influences are interrelated, if the system is
to obey the thermodynamic law identified by the state equations (5.8).

These interrelations can be made explicit by considering the partial derivatives of (5.8):

$$\begin{aligned}
\frac{\partial \bar m_1}{\partial h_1} &= (1 - \bar m_1^2) \left( J_{11} \frac{\partial \bar m_1}{\partial h_1} + J_{12} \frac{\partial \bar m_2}{\partial h_1} + 1 \right), \\
\frac{\partial \bar m_1}{\partial h_2} &= (1 - \bar m_1^2) \left( J_{11} \frac{\partial \bar m_1}{\partial h_2} + J_{12} \frac{\partial \bar m_2}{\partial h_2} \right), \\
\frac{\partial \bar m_2}{\partial h_2} &= (1 - \bar m_2^2) \left( J_{21} \frac{\partial \bar m_1}{\partial h_2} + J_{22} \frac{\partial \bar m_2}{\partial h_2} + 1 \right), \\
\frac{\partial \bar m_2}{\partial h_1} &= (1 - \bar m_2^2) \left( J_{21} \frac{\partial \bar m_1}{\partial h_1} + J_{22} \frac{\partial \bar m_2}{\partial h_1} \right).
\end{aligned}$$

By relabeling $d_i = (1 - \bar m_i^2)$, writing $J_1$ and $J_2$ for $J_{11}$ and $J_{22}$, and using definition (5.9)
(so that $c_{12} = c_{21}$), we can rewrite this system of equations as

$$\begin{aligned}
J_1 c_{11} + J_{12} c_{12} &= \frac{c_{11}}{d_1} - 1, \\
J_1 c_{12} + J_{12} c_{22} &= \frac{c_{12}}{d_1}, \\
J_{21} c_{12} + J_2 c_{22} &= \frac{c_{22}}{d_2} - 1, \\
J_{21} c_{11} + J_2 c_{12} &= \frac{c_{12}}{d_2}.
\end{aligned}$$

This is linear in the $J_{ij}$, and the former two equations are independent from the latter
two, so that we can easily solve for the $J_{ij}$ using Cramer's rule. This together with the
equations of state (5.8) allows us to express all the model parameters $J_{ij}$ and $h_i$ as functions
of the observable quantities $\bar m_i$ and $c_{ij}$, as follows:

$$J_1 = \frac{\left( \frac{c_{11}}{d_1} - 1 \right) c_{22} - \frac{c_{12}^2}{d_1}}{c_{11} c_{22} - c_{12}^2},$$

$$J_{12} = \frac{c_{12}}{c_{11} c_{22} - c_{12}^2} = J_{21},$$

$$J_2 = \frac{\left( \frac{c_{22}}{d_2} - 1 \right) c_{11} - \frac{c_{12}^2}{d_2}}{c_{11} c_{22} - c_{12}^2},$$

$$h_1 = \operatorname{arctanh} \bar m_1 - J_1 \bar m_1 - J_{12} \bar m_2,$$

$$h_2 = \operatorname{arctanh} \bar m_2 - J_{12} \bar m_1 - J_2 \bar m_2.$$

In this case we see the consistency condition $J_{12} = J_{21}$ fulfilled a priori. This tells us that,
given a set of sub-magnetizations, together with its covariance matrix, our parametrized
family contains one and only one model corresponding to it. As a consequence we can say
that such a model makes use of exactly the amount of information provided by the time
series of standard statistics (i.e. means and covariances) of a poll-type database.
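
The inversion just described is short enough to be sketched directly. The following Python fragment implements the closed-form expressions for $J_1$, $J_2$, $J_{12}$, $h_1$ and $h_2$ in terms of the means and susceptibilities, and checks them on synthetic input. The forward map used for the check, $C = (D^{-1} - J)^{-1}$ with $D = \mathrm{diag}(d_1, d_2)$, follows from the same linear response relations; the function names and all numerical values are hypothetical.

```python
import math


def infer_parameters(m1, m2, c11, c12, c22):
    """Closed-form inversion of the linear response relations (symmetric c_12 = c_21)."""
    d1, d2 = 1.0 - m1 ** 2, 1.0 - m2 ** 2
    det = c11 * c22 - c12 ** 2
    J1 = ((c11 / d1 - 1.0) * c22 - c12 ** 2 / d1) / det
    J2 = ((c22 / d2 - 1.0) * c11 - c12 ** 2 / d2) / det
    J12 = c12 / det  # equals J_21 by construction
    h1 = math.atanh(m1) - J1 * m1 - J12 * m2
    h2 = math.atanh(m2) - J12 * m1 - J2 * m2
    return J1, J2, J12, h1, h2


def susceptibilities(m1, m2, J1, J2, J12):
    """Forward map, used only as a consistency check: C = (D^{-1} - J)^{-1}."""
    a11 = 1.0 / (1.0 - m1 ** 2) - J1
    a22 = 1.0 / (1.0 - m2 ** 2) - J2
    a12 = -J12
    det = a11 * a22 - a12 ** 2
    return a22 / det, -a12 / det, a11 / det  # c11, c12, c22


# Hypothetical means and couplings, used to generate synthetic susceptibilities.
m1, m2 = 0.5, 0.6
c11, c12, c22 = susceptibilities(m1, m2, J1=1.1, J2=1.3, J12=0.05)

J1_hat, J2_hat, J12_hat, h1_hat, h2_hat = infer_parameters(m1, m2, c11, c12, c22)
print(J1_hat, J2_hat, J12_hat, h1_hat, h2_hat)
```

On this synthetic input the inversion returns the couplings that were used to generate the susceptibilities, and the recovered fields make the given means satisfy the state equations exactly.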

Estimators for rhi and Cij from the time series data are straightforward to obtain, and 
we have gathered these statistics for our three case studies in Tables 15.101 15.121 and 15.141 
Given a time period T, which in our case shall correspond to a range of four consecutive 
years, we define estimators ?7ij(T) of rhi and Cij{T) of Cij corresponding to it 

year£T 

Cr.iT) = {rnr'--mm){rnf^'-rh,{T)). 

year£T 

(5.10) 

We must point out that in order to be well defined, such estimators should apply to a
time series of samples which are of equal size, since the susceptibility $c_{ij}$ has an explicit
size dependence. Our systems, on the other hand, cannot be of equal size since they consist
of people who chose to participate in an activity, and the number of these people cannot be
established a priori. As stated before, however, the point of view in this thesis is that human
affairs can behave following the kind of quasi-static processes familiar to thermodynamics.
Consistently with this perspective, and with some justification coming from the considered
data, we shall consider the system's population a slowly varying quantity, and use its average
over the $|T|$ years of each small period $T$ as the quantity $N_T$ in order to define $c_{ij}(T)$:

\[
N_T = \frac{1}{|T|} \sum_{\text{year} \in T} N^{\text{year}}.
\]
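
A minimal sketch of how such estimators might be computed from yearly data is given below. It rests on assumptions that are not fixed by the text: the covariance is normalised by the number of years in the period and scaled by the average population N_T, the function name `estimate_statistics` is hypothetical, and the yearly values are made up purely for illustration (they are not the thesis data).

```python
def estimate_statistics(m1_series, m2_series, sizes):
    """Estimate (m1, m2, c11, c22, c12, N_T) for one time period T.

    Assumed conventions (not confirmed by the source):
      - N_T is the plain average of the yearly population sizes;
      - c_ij(T) = N_T / |T| * sum of products of deviations.
    """
    n_years = len(sizes)
    if not (len(m1_series) == len(m2_series) == n_years) or n_years == 0:
        raise ValueError("series must be non-empty and of equal length")

    N_T = sum(sizes) / n_years
    m1_bar = sum(m1_series) / n_years
    m2_bar = sum(m2_series) / n_years

    def scaled_cov(xs, x_bar, ys, y_bar):
        # Size-scaled covariance of two yearly series over the period.
        total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        return N_T * total / n_years

    c11 = scaled_cov(m1_series, m1_bar, m1_series, m1_bar)
    c22 = scaled_cov(m2_series, m2_bar, m2_series, m2_bar)
    c12 = scaled_cov(m1_series, m1_bar, m2_series, m2_bar)
    return m1_bar, m2_bar, c11, c22, c12, N_T


# Made-up four-year series, for illustration only.
m1_years = [0.30, 0.27, 0.23, 0.20]
m2_years = [0.60, 0.59, 0.57, 0.56]
population = [210000, 205000, 200000, 195000]

m1_bar, m2_bar, c11, c22, c12, N_T = estimate_statistics(
    m1_years, m2_years, population
)
print(m1_bar, m2_bar, c11, c22, c12, N_T)
```

Because the covariance is multiplied by N_T, the estimates scale linearly with the assumed population size, which is why the slowly-varying-population assumption matters for the resulting values of the susceptibilities.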

We can thus use relations (5.10) in order to obtain estimates for the model parameters. By
considering that

\[ \alpha_0 = h_2, \tag{5.11} \]
\[ \alpha_1 = h_1 - h_2, \tag{5.12} \]

we can compare the new estimates, presented in Tables 5.11, 5.13 and 5.15, with those from
the preceding section.
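
As a rough check of relations (5.11) and (5.12), the rounded 2000-2003 entries of Tables 5.10 and 5.11 for the "religious vs civil marriage" case study can be substituted into the expressions for $h_1$ and $h_2$. Since the tabulated inputs are rounded to two decimals, agreement should only be expected to that precision:

```latex
\begin{align*}
h_2 &\approx \operatorname{arctanh}(0.58) - 0.00 \cdot 0.25 - 1.51 \cdot 0.58
     \approx 0.662 - 0.876 = -0.214, \\
h_1 &\approx \operatorname{arctanh}(0.25) - 1.07 \cdot 0.25 - 0.00 \cdot 0.58
     \approx 0.255 - 0.268 = -0.013, \\
\alpha_0 &= h_2 \approx -0.21, \qquad
\alpha_1 = h_1 - h_2 \approx -0.013 + 0.214 \approx 0.20.
\end{align*}
```

Both values match the first column of Table 5.11.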
5.6.2 Comments on results from the two estimation approaches

We can now compare Tables 5.11, 5.13 and 5.15 with their counterparts from the last section,
which estimated the same model for the same data coming from our three chosen case
studies, using our adaptation of Berkson's method.

Such a comparison can be summarised as follows: comparing Table 5.11, showing parameter
estimations for the "religious vs civil marriage" case study, with Table 5.2, we find the
estimated values to be definitely different, but we also see that they bear some interesting
similarities, especially if we consider the confidence interval provided by the least squares
method in Table 5.2. Three shared features are particularly noteworthy:

- The estimated values for $J_1$ and $J_2$ are similar in one aspect: in both cases $J_2$ is
estimated to be consistently greater than $J_1$ over the years;

- $J_{12}$ is estimated to be very close to zero in Table 5.11, while $J_{12}$ and $J_{21}$ can be
considered to be statistically zero in Table 5.2 (which is also consistent with the condition
$J_{12} - J_{21} = 0$);

- $\alpha_0$ and $\alpha_1$ are consistently estimated with equal signs by both methods: this is an
essential prerequisite that any model needs to satisfy.

The agreement is not good for the two remaining case studies, however. In the "con- 
sensual vs non-consensual divorce" case study, despite estimations being consistent in the 
first time range (that is 2000-2003), agreement gets worse and worse in the following two 
periods. As for the third case study, the two estimation methods do not show any agreement 
whatsoever. 

An important point to be made is the dependence of the agreement between the two methods
on population size. For the first case study, where the population is made up of over 200
thousand people, the agreement between the two methods is good. In the second case study
we have a population of roughly 40 thousand people, and we find agreement in one of the
three considered time spans. The third case study does not show any agreement: the
population size here, however, is only around 2000 people.

Finally, though the last point certainly motivates further enquiry, one should not be
over-confident about population size being the only problem. An extremely important
objection comes from the fact that wherever agreement is found, the estimators $c_{ij}$ are found
to give very high values. We must remember that we are looking at the data through a
model that assumes equilibrium: such large $c_{ij}$ values correspond to large fluctuations, and
these should cause an equilibrium model to be less precise and not more.
The failure of the two estimation methods to give consistent results in regimes with small
fluctuations (that is, whenever the $c_{ij}$ are small) reveals the presented study as inconclusive
on an empirical level. There are however several improvements that can be made by using
the same framework established here, the most important one concerning the handling of
the data. This thesis has as its goal to propose both a model and a procedure for
establishing the empirical relevance of the model itself. It was hence of the foremost importance
to show a concrete example of such a procedure; since this was not a professional work in
statistics, however, it featured several drawbacks, some of which can be described as follows:

- Though showing a remarkable temporal coherence, the time series consists of a number 
of measurements which is insufficient for any statistic to be reliable. In order to work 
on consistent groups of data, the choice was made to gather data in four-year ranges: 
the situation may be improved by considering a phenomenon having the same kind of 
temporal coherence, but for which measurements are available on a monthly basis; 

- The regional separation between "Northern Italy" and "Southern Italy" is an artificial 
one, decided for technical reasons. The quality of the statistical study could be greatly 
improved by considering a partition into groups which is directly relevant to the issue 
under study; 

- No use was made of the data regarding the relative sizes of the considered
sub-populations. This, as noted before, was due to a difficulty arising from the mean-field
assumption, which led us to characterize the sub-populations as having equal size.
This drawback can be amended in two ways: 1) at a fundamental level, by further
considering the implications of having populations of different size for the model; 2)
by keeping the same model, but considering estimators for $c_{ij}$ that make use of the
information coming from the sub-population sizes.

A final point to make concerns the model itself: very little is known about the structure
of the phase diagram of a mean-field model of a multi-part system. Indeed, as noted in
earlier chapters, a subcase of the two-part system considered here has been studied on several
occasions since the nineteen-fifties [33, 9] until recently [46], and found to be highly
non-trivial. As a consequence, it is to be expected that the analysis of the features characterizing
the regime that empirical data identify will need to be treated locally and numerically before
any kind of global picture arises, and it is not a priori clear whether the presence of large
values for the $c_{ij}$ might characterize an interesting regime rather than just a failure of the
model. It is mainly for this reason that much of the effort in this thesis has been directed
towards the aim of establishing a way to link the model to data, rather than to pursue
further the analytic treatment of the model on its own.



                               4-year period
Statistic      2000-2003    2001-2004    2002-2005    2003-2006
$\bar m_1$          0.25         0.20         0.15         0.12
$\bar m_2$          0.58         0.56         0.54         0.52
$c_{11}$         1636.09       953.63       466.42       106.59
$c_{22}$          346.88       214.58       122.02        22.30
$c_{12}$          562.15       336.03       176.09        34.09

Table 5.10: Religious vs civil marriages: statistics



                               4-year period
Parameter      2000-2003    2001-2004    2002-2005    2003-2006
$\alpha_0$         -0.21        -0.18        -0.15        -0.10
$\alpha_1$          0.20         0.17         0.14         0.08
$J_1$               1.07         1.04         1.02         1.00
$J_2$               1.51         1.44         1.39         1.29
$J_{12}$            0.00         0.00         0.01         0.03

Table 5.11: Religious vs civil marriages: estimation of the interacting model



                         4-year period
Statistic      2000-2003    2001-2004    2002-2005
$\bar m_1$          0.59         0.63         0.63
$\bar m_2$          0.38         0.45         0.45
$c_{11}$           74.21         1.27         0.15
$c_{22}$          356.27         1.76         1.68
$c_{12}$          120.56        -0.17         0.28

Table 5.12: Consensual vs non-consensual divorces: statistics
                         4-year period
Parameter      2000-2003    2001-2004    2002-2005
$\alpha_0$         -0.05         0.23        -0.71
$\alpha_1$         -0.17         0.01         5.81
$J_1$               1.51         0.85        -8.06
$J_2$               1.16         0.68         0.39
$J_{12}$            0.01        -0.07         1.61

Table 5.13: Consensual vs non-consensual divorces: estimation of the interacting model



                                     4-year period
Statistic      2000-2003    2001-2004    2002-2005    2003-2006    2004-2007
$\bar m_1$         -0.29        -0.29        -0.29        -0.30        -0.28
$\bar m_2$         -0.25        -0.26        -0.25        -0.24        -0.25
$c_{11}$            0.94         0.62         0.30         0.71         1.95
$c_{22}$            0.23         1.50         2.05         3.57         3.66
$c_{12}$           -0.08         0.00         0.11        -0.50        -0.91

Table 5.14: Suicidal tendencies: statistics



                                     4-year period
Parameter      2000-2003    2001-2004    2002-2005    2003-2006    2004-2007
$\alpha_0$         -1.21        -0.16        -0.06        -0.13        -0.11
$\alpha_1$          0.81        -0.29        -0.87        -0.37        -0.08
$J_1$              -0.01        -0.53        -2.33        -0.45         0.51
$J_2$              -3.40         0.41         0.57         0.75         0.76
$J_{12}$           -0.39         0.00         0.19        -0.22        -0.14

Table 5.15: Suicidal tendencies: estimation of the interacting model